Describing common syntactic structures

This chapter of the documentation shows how to implement common grammar patterns like lists & repetitions in a way that's most efficient for a LALR(1) parser like Dissect.

List of 1 or more `Foo`s

$this('Foo+')
    ->is('Foo+', 'Foo')
    ->call(function ($list, $foo) {
        $list[] = $foo;

        return $list;
    })

    ->is('Foo')
    ->call(function ($foo) {
        return [$foo];
    });

With some practice, it's very easy to see how this works: when the parser recognizes the first Foo, it reduces it to a single-item array and for each following Foo, it just pushes it onto the array.

Note that Foo+ is just a rule name, it could be equally well called Foos, ListOfFoo or anything else you feel like.

List of 0 or more `Foo`s

$this('Foo*')
    ->is('Foo*', 'Foo')
    ->call(function ($list, $foo) {
        $list[] = $foo;

        return $list;
    })

    ->is(/* empty */)
    ->call(function () {
        return [];
    });

This works pretty much the same like the previous example, the only difference being that we allow Foo* to match nothing.

A comma separated list

The first example of this chapter is trivial to modify to include commas between the Foos. Just change the second line to:

$this('Foo+')
    ->is('Foo+', ',', 'Foo')
...

The second example, however, cannot be modified so easily. We cannot just put a comma in the first alternative:

$this('Foo*')
    ->is('Foo*', ',', 'Foo')
...

since that would allow the list to start with a comma:

, Foo , Foo , Foo

Instead, we say that a "list of zero or more Foos separated by commas" is actually "a list of one or more Foos separated by commas or nothing at all". So our rule now becomes:

$this('Foo*')
    ->is('Foo+')

    ->is(/* empty */)
    ->call(function () {
        return [];
    });

$this('Foo+')
    ->is('Foo+', ',', 'Foo')
...

Expressions

A grammar for very basic mathematical expressions is described in the chapter on parsing. It would require extensive modifications to allow for other operators, function calls, unary operators, ternary operator(s), but there's a lot of grammars for practical programming languages on the internet that you can take inspiration from.

For starters, take a look at this grammar for PHP itself.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

common.md

common.md

Describing common syntactic structures

List of 1 or more `Foo`s

List of 0 or more `Foo`s

A comma separated list

Expressions

Files

common.md

Latest commit

History

common.md

File metadata and controls

Describing common syntactic structures

List of 1 or more Foos

List of 0 or more Foos

A comma separated list

Expressions

List of 1 or more `Foo`s

List of 0 or more `Foo`s