AST and Grammar Design Discussion¶
These are open (design) questions that should be answered by the thesis.
Structural and Visual Aspects¶
Currently the validation grammar and the visual grammar are stored in the same model. This is bad and must change, but whats a better approach?
Apart from beeing visually pleasing in its formal grammar representation it must also be meaningfully convertible in HTML. This sadly excludes the simplest possibility of simply inserting virtual linebreaks, because this doesn’t work nicely with CSS flexboxes, see minimal_indent.html
. for an example how this fails. A more or less straightforward HTML layout is proposed at row-col.html
.
Example grammars (whithout visual aspects)¶
- Can’t be meaningfully transformed into a block language, terminal symbols are missing
grammar "xml_1" {
node "xml"."element" {
prop "name" { string }
children "elements" ::= element*
children "attributes" ::= attribute*
}
node "xml"."attribute" {
prop "name" { string }
prop "value" { string }
}
}
grammar "json_1" {
typedef "json"."value" ::= string | number | boolean | object | array | null
node "json"."string" {
prop "value" { string }
}
node "json"."number" {
prop "value" { integer }
}
node "json"."boolean" {
prop "value" { boolean }
}
node "json"."object" {
children allowed "values" ::= key-value*
}
node "json"."key-value" {
children allowed "key" ::= string
children allowed "value" ::= value
}
node "json"."array" {
children allowed "values" ::= value*
}
node "json"."null" { }
}
Example XML grammar (with terminal symbols, current state)c¶
- Helpful: Syntax-aspects of XML are now known to the block editor
- Terminal symbols not enough information to turn into a block language: * Missing structural information line breaks or “horizontal” / “vertical” layouts. * No possibility to define “separators” between children
- Therefore:
children
command addsseperator
grammar "xml_2" {
node "xml"."element" {
terminal "tag-open-begin" "<"
prop "name" { string }
children "attributes" ::= attribute*
terminal "tag-open-end" ">"
children "elements" ::= element*
terminal "tag-close" "<name/>"
}
node "xml"."attribute" {
prop "name" { string }
terminal "equals" "="
terminal "quot-begin" "\"
prop "value" { string }
terminal "quot-end" "\""
}
}
Idea: Seperate definition for “Visual Grammar”¶
- Is a visual grammar and must provide visualization for all instances of
node
mentioned in the visualized language. - Adds a new type of command called
block
- Allows to interpolate properties using
{{ }}
- Inserts children using the
{{#children}}
directive. - Problem: Nesting of
row
elements not straightforward.
grammar "xml_3" visualizes "xml_1" {
block "xml"."attribute" {
<row>{{name}}="{{value}}"</row>
}
block "xml"."element" {
<row><{{name}}{{#children attributes, sep=" "}}></row>
<indent>{{#children elements}}</indent>
<row><{{name}}></row>
}
}
Merging grammars¶
Sometimes languages are interweaved with one another, especially on the: HTML
may contain CSS
and JavaScript
, many languages allow embedded JSON
structures. It would possibly save lots of effort to allow grammars to be combined, e.g. to use the same CSS
and JavaScript
grammars that are already provided when describing HTML
.
On a fundamental level, the grammars and the syntaxtrees have already been designed with this merging in mind: The language
namespace is part of all definitions. The tricky part is the actual connection: How do we
Linked Trees¶
References between code fragments happen all the time: HTML
documents reference CSS
stylesheets, JavaScript
files or other HTML
documents, Ruby
code requires
other files, C
loads them using #include
… The same should be possible with syntaxtrees in a hopefully generic manner, so that all block editors can either display the referenced resource inline or at least allow navigation to it.
Drop Target Resolution¶
- Each
block
introduces a new drop target, dropping something on it could mean “insert in here” or “append here”.- Especially tricky with constructs like
if
were both operations are sensible.
- Especially tricky with constructs like
- Current default:
- Dropping on a block prioritizes the “append” operation, insertion happens only on demand
* Great for
SELECT
ofSQL
: Children have different type then siblings children
introduce drop targets, may or may not be allowed to be empty
- Dropping on a block prioritizes the “append” operation, insertion happens only on demand
* Great for
Implemented drop strategies¶
allowExact
- Allows a drop if the given drop location allows the insert, great (and default) for purposefully inserted drop markers.
allowEmbrace
- Allow the dropped thing to “embrace” the node at the given location, effectively replacing it. This is great for things like parentheses and unary or binary expressions, but can lead to bad conflicts with e.g. function calls (which are of course the general case of unary or binary expressions).
allowReplacement
- Allow the dropped thing to take the place of the node at the given deletion, effectively deleting it. This is useful if a location is a hole of length 1, e.g. replacing is the only syntactically sound option.
allowAppend
- Treat the drop as if it happened somewhere after the drop location (on a sibling level, not the child level). This is basically the default behavior of almost any visual and it is useful for lists of statements in imperative programming languages or lists of tables in SQL or lists of list items in JSON, …
allowAnyParent
- Walks up the tree and checks all child groups of each parent whether an insertion would be possible. This is helpful in quite strongly typed grammars. In the current implementation of SQL it e.g. allows to drop the
SQL`-components (``SELECT
,FROM
,WHERE
,GROUP BY
, …) virtually anywhere, because there is exactly one meaningful place that they could fit. In less strictly typed grammars this is probably not as useful.
Common drop problems and ambiguities¶
- Appending vs Embracing in expressions in Lists
- If a non-leaf expression appears in a list of expressions, dropping something on that expression could mean
append
(add a new expression afterwards),embrace
(e.g. negating the expression) orinsertAtChild
(e.g. adding a function call argument). The last option is not currently implemented as a strategy, because children are inserted using holes andallowExact
. This strategy gets tricky however if e.g. a function (likeCOUNT
inSQL
or every function call inJavaScript
) can take any number of arguments. The ambiguity regarding this can be reduced with a stronger type system. - Missing root nodes
Synthetic nodes are a tricky thing to display, but are usually required at the root level. The “visual” roots of an
SQL
statement are eitherSELECT
,INSERT
,UPDATE
orDELETE
, but these components are actually child nodes of anquerySelect
,queryInsert
, … But the user doesn’t want to drop those synthetic nodes ever, so there are two possibilities:- Create the synthetic root node together with the tree and never allow the user to change or delete it.
- Don’t actually drop a single node, but offer a list of semantically equivalent options.
Option #2 is what is currently implemented. When e.g. a
SELECT
component is dragged from the sidebar, two trees are actually tested for dropping: Nothing but theSELECT
node or theSELECT
node wrapped in the syntheticquerySelect
root node.- Type changes on dragging
- Dragging e.g. a function definition into a statement could be interpeted as “call this function”. This however requires a change of the dragged type. The same happens in
SQL
when dragging a named expression from theSELECT
component: The user probably doesn’t want to insert<expr> as <name>
into theGROUP BY
component, but reference<name>
there.
Dealing with ambiguity¶
Combing the drop strategies mentioned above may result in more then a single operation that could be carried out. It is probably not possible in all cases to resolve every ambiguity automatically, so this requires at least a nice UI.
- A simple but sort of “brutal” version would be to simply show a modal popup with all alternatives. The minimal implementation of this is very straightforward, as the validation process generates trees for all strategies anyway. Quite a lot nicer would be a “diff” of the trees and then only the subsequent display of differences to chose from.
- A possibly nicer version would be to leave “drop ghosts” in the tree: Instead of a single proper node, multiple feint ‘ghost node’ are inserted into the tree. These nodes require one more user interaction (e.g. a click) to actually be manifested into a proper node. This manifestations also removes all other ghosts that could possibly have been inserted, the user has therefor cleared up the ambiguity.
Structural Tree Choices¶
There are often many different ways to describe structurally different but semantically identical trees.