Skip to content
This repository has been archived by the owner on Apr 8, 2024. It is now read-only.

[WIP] Implement (possibly?) the entire Rust grammar. #13

Closed
wants to merge 20 commits into from
Closed
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions build.rs
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,22 @@ fn main() {
// Start with the builtin rules for proc-macro grammars.
let mut grammar = gll::proc_macro::builtin();

// HACK(eddyb) inject a custom builtin - this should be upstreamed to gll!
{
use gll::proc_macro::{FlatTokenPat, Pat};

grammar.define(
"LIFETIME",
gll::grammar::eat(Pat(vec![
FlatTokenPat::Punct {
ch: Some('\''),
joint: Some(true),
},
FlatTokenPat::Ident(None),
])),
);
}

// Add in each grammar fragment to the grammar.
for fragment in fragments {
let path = fragment.into_path();
Expand Down
32 changes: 32 additions & 0 deletions grammar/abi.g
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
// FIXME(eddyb) implement more specific literal support in `gll::proc_macro`
Abi = LITERAL?;

/*
// HACK(eddyb) taken from `librustc_target/spec/abi.rs`
Abi =
// Defaults to "C"
{} |

// Platform-specific ABIs
"\"cdecl\"" |
"\"stdcall\"" |
"\"fastcall\"" |
"\"vectorcall\"" |
"\"thiscall\"" |
"\"aapcs\"" |
"\"win64\"" |
"\"sysv64\"" |
"\"ptx-kernel\"" |
"\"msp430-interrupt\"" |
"\"x86-interrupt\"" |
"\"amdgpu-kernel\"" |

// Cross-platform ABIs
"\"Rust\"" |
"\"C\"" |
"\"system\"" |
"\"rust-intrinsic\"" |
"\"rust-call\"" |
"\"platform-intrinsic\"" |
"\"unadjusted\"";
*/
8 changes: 3 additions & 5 deletions grammar/attr.g
Original file line number Diff line number Diff line change
@@ -1,9 +1,7 @@
OuterAttr = "#" attr:Attr;
InnerAttr = "#!" attr:Attr;
InnerAttr = "#" "!" attr:Attr;
Attr = "[" path:Path input:AttrInput "]";
AttrInput =
{} |
"=" LITERAL |
"(" TOKEN_TREE* ")" |
"[" TOKEN_TREE* "]" |
"{" TOKEN_TREE* "}";
"=" TOKEN_TREE |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using token_tree (and macroinput) here seems very broad. The grammar at /~https://github.com/rust-lang-nursery/reference/blob/master/src/attributes.md I think is correct and more specific. I'm not sure what would be a good way to express that literals cannot have suffixes, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ehuss This is intended afaik. rust-lang/rust#55208.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, indeed, I had forgotten about that. Regardless, I think it would be good if the grammar eventually covered the stable syntax.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK it has been stabilized for attribute proc macros, just not built-in ones?
@petrochenkov can you confirm?

As for specific built-in attributes, sure, we could provide grammars for each of them, but it's similar to having a grammar for e.g. format_args!.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntax with delimiters is stable and is rejected/gated in non-macro attributes with a semantic check.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PATH = TOKEN_TREE is mostly unstable, I think.
Key-value attributes are not allowed in macros, and gated for non-literals in non-macros, but gating happens after expansion, so something like #[doc = $doc] where $doc: expr is also on stable.

MacroInput;
116 changes: 116 additions & 0 deletions grammar/expr.g
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
Expr = attrs:OuterAttr* kind:ExprKind;
ExprKind =
Literal:LITERAL |
Paren:{ "(" attrs:InnerAttr* expr:Expr ")" } |
Borrow:{ "&" mutable:"mut"? expr:Expr } |
Box:{ "box" expr:Expr } |
Unary:{ op:UnaryOp expr:Expr } |
Try:{ expr:Expr "?" } |
Binary:{ left:Expr op:BinaryOp right:Expr } |
Assign:{ left:Expr { "=" | op:BinaryAssignOp } right:Expr } |
Range:{ start:Expr? ".." end:Expr? } |
RangeInclusive:{ start:Expr? "..=" end:Expr } |
Type:{ expr:Expr ":" ty:Type } |
Cast:{ expr:Expr "as" ty:Type } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type without bounds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type:{ expr:Expr ":" ty:Type } | is also unstable (so leave a note).

Field:{ base:Expr "." field:FieldName } |
Index:{ base:Expr "[" index:Expr "]" } |
Array:{ "[" attrs:InnerAttr* exprs:Expr* % "," ","? "]" } |
Repeat:{ "[" attrs:InnerAttr* elem:Expr ";" count:Expr "]" } |
Tuple:{ "(" attrs:InnerAttr* exprs:Expr* % "," ","? ")" } |
Copy link

@petrochenkov petrochenkov Dec 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(EXPR) is accepted as a tuple here (introducing false ambiguity).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Same with types and patterns.)

Path:QPath |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me there needs to be more kinds of paths. For example, here it seems to allow foo<T> which needs the :: to parse correctly (if I'm reading it right).

Call:{ callee:Expr "(" args:Expr* % "," ","? ")" } |
MethodCall:{ receiver:Expr "." method:PathSegment "(" args:Expr* % "," ","? ")" } |
Struct:{ path:Path "{" attrs:InnerAttr* fields:StructExprFieldsAndBase "}" } |
Block:{ { label:Label ":" }? unsafety:"unsafe"? block:Block } |
AsyncBlock:{ "async" block:Block } |
Centril marked this conversation as resolved.
Show resolved Hide resolved
TryBlock:{ "try" block:Block } |
Continue:{ "continue" label:Label? } |
Break:{ "break" label:Label? value:Expr? } |
Return:{ "return" value:Expr? } |
Yield:{ "yield" value:Expr? } |
If:If |
Match:{ "match" expr:Expr "{" attrs:InnerAttr* arms:MatchArm* "}" } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this Expr needs to exclude struct expressions.

Loop:{ { label:Label ":" }? "loop" body:Block } |
While:{ { label:Label ":" }? "while" cond:Cond body:Block } |
For:{ { label:Label ":" }? "for" pat:Pat "in" expr:Expr body:Block } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this Expr needs to exclude struct expressions.

Closure:{
generator_static:"static"? asyncness:"async"? by_val:"move"?
Centril marked this conversation as resolved.
Show resolved Hide resolved
"|" args:ClosureArg* % "," ","? "|" { "->" ret_ty:Type }? body:Expr
} |
MacroCall:MacroCall;

UnaryOp =
Not:"!" |
Neg:"-" |
Deref:"*";

BinaryOp =
// Arithmetic & bitwise (these also have `BinaryAssignOp` forms)
Add:"+" |
Sub:"-" |
Mul:"*" |
Div:"/" |
Rem:"%" |
BitXor:"^" |
BitAnd:"&" |
BitOr:"|" |
Shl:"<<" |
Shr:">>" |

// Logic (short-circuiting)
LogicAnd:"&&" |
LogicOr:"||" |

// Comparisons
Eq:"==" |
Lt:"<" |
Le:"<=" |
Ne:"!=" |
Gt:">" |
Ge:">=";

// FIXME(eddyb) figure out how to deduplicate this with `BinaryOp`
// The problem is something like `BinaryOp "="` allows a space in between,
// as the last token inside each `BinaryOp` case does not require being
// joint, but each token before the `=` here does impose that requirement.
BinaryAssignOp =
Add:"+=" |
Sub:"-=" |
Mul:"*=" |
Div:"/=" |
Rem:"%=" |
BitXor:"^=" |
BitAnd:"&=" |
BitOr:"|=" |
Shl:"<<=" |
Shr:">>=";

FieldName =
Ident:IDENT |
// FIXME(eddyb) restrict this to only integer literals
Numeric:LITERAL;

// FIXME(eddyb) find a way to express this `A* B?` pattern better
StructExprFieldsAndBase =
Fields:{ fields:StructExprField* % "," ","? } |
Base:{ ".." base:Expr } |
FieldsAndBase:{ fields:StructExprField+ % "," "," ".." base:Expr };

StructExprField = attrs:OuterAttr* kind:StructExprFieldKind;
StructExprFieldKind =
Shorthand:IDENT |
Explicit:{ field:FieldName ":" expr:Expr };

Label = LIFETIME;

If = "if" cond:Cond then:Block { "else" else_expr:ElseExpr }?;
Cond =
Bool:Expr |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this (and the expr in "let" below) needs to exclude struct expressions.

Let:{ "let" pat:Pat "=" expr:Expr };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if let PAT | PAT = EXPR { ... } is not covered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might've been accepted/implemented after I've already opened this PR, heh.

ElseExpr =
Block:Block |
If:If;

MatchArm = attrs:OuterAttr* "|"? pats:Pat+ % "|" { "if" guard:Expr }? "=>" body:Expr ","?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to express the , separator completely. If there are mutliple arms, then , is required if it is a non-block expression for the first N-1 arms. MatchArms

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I gave up on that when I realized Block is not the only case nowadays.


ClosureArg = pat:Pat { ":" ty:Type }?;
35 changes: 35 additions & 0 deletions grammar/generics.g
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
Generics = "<" params:GenericParam* % "," ","? ">";
GenericParam = attrs:OuterAttr* kind:GenericParamKind;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lifetimes can't be listed after types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's more of a semantic restriction, to be really honest.
The introduction of const generics has made the old rules seem quite silly other thsn from a stylistic point of view. Most of the compiler has been fixed to not assume lifetimes come first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it actually were a semantic restriction then this would not be an error:

macro_rules! foo {
    ($x: item) => {}
}

foo! {
    fn stuff<T, 'a>() {}
}

(Semantic vs. syntactic restrictions have semver consequences due to macros...)

And yeah... grammar wise this is super annoying, especially since const generic variables can be interspersed with type variables (afaik, or we haven't made any decision here...).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we currently have the check duplicated (since the AST type loses the ordering requirement). @petrochenkov knows more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know anything anymore since @varkor rewrote everything for const generics.
Before that different kinds of parameters were stored separately in AST and the ordering was enforced solely in the parser.

GenericParamKind =
Lifetime:{ name:LIFETIME { ":" bounds:LifetimeBound* % "+" "+"? }? } |
Type:{ name:IDENT { ":" bounds:TypeBound* % "+" "+"? }? { "=" default:Type }? };

ForAllBinder = "for" generics:Generics;

WhereClause = "where" bounds:WhereBound* % "," ","?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows a bare , which I don't think is valid.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, it's not valid! Hmm and neither is (,). I think I need to make a general note about this.

WhereBound =
Lifetime:{ lt:LIFETIME ":" bounds:LifetimeBound* % "+" "+"? } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this allows a bare + which is not allowed?

Type:{ binder:ForAllBinder? ty:Type ":" bounds:TypeBound* % "+" "+"? } |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disturbing fact: the following is valid Rust syntax:

trait X {}

fn foo<T>()
where for<'a> for<'b> X: Clone
{}

however, syntactically these for's are different: the first one is a part of where predicate, the second one is a part of the type. Our ambiguity testing should ideally expose this problem.

Copy link
Member Author

@eddyb eddyb Nov 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yupp, we basically have a precedence rule that eagerly takes the first encountered for<'a> and uses it for the whole T: A + B +....

Even more fun is the optional but not nestable parens around a bound.
where i32: (for<> Copy) looks weird but is perfectly valid today!

@Centril suggested we should make it a proper expression grammar with parens and + as a binop between bounds, which I feel pretty sympathetic towards.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even more fun is the optional but not nestable parens around a bound.

Sorry, I tried to prevent that, but lang team forced the decision.

we should make it a proper expression grammar with parens and + as a binop between bounds

Why?
Bound + Bound + Bound + ... is a list with separator, not multiple binary operators, it could use Bound, Bound, Bound, ... or something equally well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Well, it would explain the parens. And they do feel maybe like entities that can be arbitrarily composed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this allows a bare + which I don't think is allowed.

TypeEq:{ binder:ForAllBinder? left:Type { "=" | "==" } right:Type };
Centril marked this conversation as resolved.
Show resolved Hide resolved

LifetimeBound = outlives:LIFETIME;
TypeBound =
Outlives:LIFETIME |
Trait:TypeTraitBound |
TraitParen:{ "(" bound:TypeTraitBound ")" };
TypeTraitBound = unbound:"?"? binder:ForAllBinder? path:Path;

GenericArgs =
AngleBracket:{ "<" args_and_bindings:AngleBracketGenericArgsAndBindings? ","? ">" } |
Paren:{ "(" inputs:Type* % "," ","? ")" { "->" output:Type }? };

// FIXME(eddyb) find a way to express this `A* B*` pattern better
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, this is tricky. I ended up listing each permutation which is a bit yuck. GenericArgs

AngleBracketGenericArgsAndBindings =
Args:GenericArg+ % "," |
Bindings:TypeBinding+ % "," |
ArgsAndBindings:{ args:GenericArg+ % "," "," bindings:TypeBinding+ % "," };

GenericArg =
Lifetime:LIFETIME |
Type:Type;
TypeBinding = name:IDENT "=" ty:Type;
95 changes: 90 additions & 5 deletions grammar/item.g
Original file line number Diff line number Diff line change
@@ -1,7 +1,92 @@
ModuleContents = attrs:InnerAttr* items:ItemWithOuterAttr*;
ModuleContents = attrs:InnerAttr* items:Item*;

ItemWithOuterAttr = attrs:OuterAttr* item:Item;
// TODO other items
Item =
Item = attrs:OuterAttr* vis:Vis? kind:ItemKind;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

macro_rules is missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained elsewhere, not special.

ItemKind =
Use:{ "use" use_tree:UseTree ";" } |
ExternCrate:{ "extern" "crate" name:IDENT { "as" rename:IDENT }? ";" } |
Use:{ "use" path:Path { "as" rename:IDENT }? ";" }; // TODO use trees
Mod:{ "mod" name:IDENT { ";" | "{" contents:ModuleContents "}" } } |
ForeignMod:{ "extern" abi:Abi "{" attrs:InnerAttr* items:ForeignItem* "}" } |
Static:{ "static" mutable:"mut"? name:IDENT ":" ty:Type "=" value:Expr ";" } |
Const:{ "const" name:IDENT ":" ty:Type "=" value:Expr ";" } |
Fn:{ header:FnHeader "fn" decl:FnDecl body:Block } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably exclude variadics.

TypeAlias:{ "type" name:IDENT generics:Generics? where_clause:WhereClause? "=" ty:Type ";" } |
ExistentialType:{ "existential" "type" name:IDENT generics:Generics? where_clause:WhereClause? ":" bounds:TypeBound* % "+" "+"? ";" } |
Centril marked this conversation as resolved.
Show resolved Hide resolved
Enum:{ "enum" name:IDENT generics:Generics? where_clause:WhereClause? "{" variants:EnumVariant* % "," ","? "}" } |
Struct:{ "struct" name:IDENT generics:Generics? body:StructBody } |
Union:{ "union" name:IDENT generics:Generics? where_clause:WhereClause? "{" fields:RecordField* % "," ","? "}" } |
Trait:{
unsafety:"unsafe"? auto:"auto"? "trait" name:IDENT generics:Generics?
Centril marked this conversation as resolved.
Show resolved Hide resolved
{ ":" superbounds:TypeBound* % "+" "+"? }?
where_clause:WhereClause? "{" trait_items:TraitItem* "}"
} |
TraitAlias:{
Centril marked this conversation as resolved.
Show resolved Hide resolved
"trait" name:IDENT generics:Generics?
{ ":" superbounds:TypeBound* % "+" "+"? }?
where_clause:WhereClause? "=" bounds:TypeBound* % "+" "+"? ";"
} |
Impl:{
defaultness:"default"? unsafety:"unsafe"? "impl" generics:Generics?
Centril marked this conversation as resolved.
Show resolved Hide resolved
{ negate:"!"? trait_path:Path "for" }? ty:Type
where_clause:WhereClause? "{" attrs:InnerAttr* impl_items:ImplItem* "}"
} |
Macro:{ "macro" name:IDENT { "(" TOKEN_TREE* ")" }? "{" TOKEN_TREE* "}" } |
Centril marked this conversation as resolved.
Show resolved Hide resolved
MacroCall:ItemMacroCall;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Macros cannot have visibility.


UseTree =
Glob:{ prefix:UseTreePrefix? "*" } |
Nested:{ prefix:UseTreePrefix? "{" children:UseTree* % "," ","? "}" } |
Simple:{ path:Path { "as" rename:IDENT }? };
UseTreePrefix =
Path:{ path:Path "::" } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need to be a mod-style path if you add more path types.

Global:"::";

ForeignItem = attrs:OuterAttr* vis:Vis? kind:ForeignItemKind;
ForeignItemKind =
Static:{ "static" mutable:"mut"? name:IDENT ":" ty:Type ";" } |
Fn:{ "fn" decl:FnDecl ";" } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems overly broad. Arguments must be named with a type, this allows anonymous arguments.

Type:{ "type" name:IDENT ";" } |
Centril marked this conversation as resolved.
Show resolved Hide resolved
MacroCall:ItemMacroCall;
Centril marked this conversation as resolved.
Show resolved Hide resolved

TraitItem = attrs:OuterAttr* kind:TraitItemKind;
TraitItemKind =
Const:{ "const" name:IDENT ":" ty:Type { "=" default:Expr }? ";" } |
Fn:{ header:FnHeader "fn" decl:FnDecl { default_body:Block | ";" } } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably exclude variadics.
Also, 2018 doesn't allow anonymous parameter names.
2015 note: Has strict limitations on the pattern types allowed.

Type:{ "type" name:IDENT generics:Generics? { ":" bounds:TypeBound* % "+" "+"? }? where_clause:WhereClause? { "=" default:Type }? ";" } |
MacroCall:ItemMacroCall;

ImplItem = attrs:OuterAttr* defaultness:"default"? vis:Vis? kind:ImplItemKind;
Centril marked this conversation as resolved.
Show resolved Hide resolved
ImplItemKind =
Const:{ "const" name:IDENT ":" ty:Type "=" value:Expr ";" } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate rule could be made for this so this long sequence isn't duplicated with the const item.

Fn:{ header:FnHeader "fn" decl:FnDecl body:Block } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No variadics.

Type:{ "type" name:IDENT generics:Generics? where_clause:WhereClause? "=" ty:Type ";" } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same note about sharing with TypeAlias.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generics are GAT and therefore unstable, unlike in type aliases.

ExistentialType:{ "existential" "type" name:IDENT generics:Generics? where_clause:WhereClause? ":" bounds:TypeBound* % "+" "+"? ";" } |
Centril marked this conversation as resolved.
Show resolved Hide resolved
MacroCall:ItemMacroCall;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No vis on macro.


FnHeader = constness:"const"? unsafety:"unsafe"? asyncness:"async"? { "extern" abi:Abi }?;
FnDecl = name:IDENT generics:Generics? "(" args:FnArgs? ","? ")" { "->" ret_ty:Type }? where_clause:WhereClause?;

// FIXME(eddyb) find a way to express this `A* B?` pattern better
FnArgs =
Regular:FnArg+ % "," |
Variadic:"..." |
RegularAndVariadic:{ args:FnArg+ % "," "," "..." };
FnArg =
SelfValue:{ mutable:"mut"? "self" } |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This misses type ascription I think (which is stable in limited form)

struct S;
impl S {
    fn f(self: S) {}
}
fn main() {}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that that parses, but self there is just a binding pattern.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Centril do we yet have a proposal for "self" "@" Pat { ":" Type }?, or were there previous attempts that died out? IMO it would be a natural extension.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eddyb that would just fall out from having self be a valid pattern + intersection patterns; i.e.

pat ::= "self" | pat "@" pat | pat ":" type | ... ;

I've mentioned the possibility of making self a valid pattern in /~https://github.com/Centril/rfcs/blob/rfc/generalized-type-ascription/text/0000-generalized-type-ascription.md#self-as-a-pattern but it isn't part of the proposal right now.

SelfRef:{ "&" lt:LIFETIME? mutable:"mut"? "self" } |
Regular:FnSigInput;

EnumVariant = attrs:OuterAttr* name:IDENT kind:EnumVariantKind { "=" discr:Expr }?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the discriminant cannot appear next to tuple or record-style variants.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should put a FIXME there. I think I hit some strange bug when I tried "doing it right".

EnumVariantKind =
Unit:{} |
Tuple:{ "(" fields:TupleField* % "," ","? ")" } |
Record:{ "{" fields:RecordField* % "," ","? "}" };

// FIXME(eddyb) could maybe be shared more with `EnumVariantKind`?
// The problem is the semicolons for `Unit` and `Tuple`, and the where clauses.
StructBody =
Unit:{ where_clause:WhereClause? ";" } |
Tuple:{ "(" fields:TupleField* % "," ","? ")" where_clause:WhereClause? ";" } |
Record:{ where_clause:WhereClause? "{" fields:RecordField* % "," ","? "}" };

TupleField = attrs:OuterAttr* vis:Vis? ty:Type;
RecordField = attrs:OuterAttr* vis:Vis? name:IDENT ":" ty:Type;
13 changes: 13 additions & 0 deletions grammar/macro.g
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
MacroCall = path:Path "!" ident_input:IDENT? input:MacroInput;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I was studying the parser code, this IDENT here had me stumped. Is this some old vestigial thing? I can't find a way to get it to actually parse, though.

Also, the type of path depends on where it is used. For the reference grammar, I just said it was a simple ("mod"-style) path, because in practice that's the only way it can be used. But from a parsing standpoint, I guess it could be more accurate. For example, foo<T>!{} does not parse as an expression, but it does parse as a type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

macro_rules! name {
    // ....
}

It depends on whether macro_rules is considered a MacroCall, I suppose. As far as I'm aware, only macro_rules! has this syntax.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, interesting.

What I was thinking of is that foo!bar{} is considered a valid statement in libsyntax (/~https://github.com/rust-lang/rust/blob/65e485d8f1d28102b426c9d6d82f835cd6470d3e/src/libsyntax/parse/parser.rs#L4711-L4714). I assumed that was what it was referring to. macro_rules is parsed through a separate mechanism (eat_macro_def). This "special identifier" was added 6 years ago, and I couldn't tell from the commit what it means. All other instances of parsing macro invocations do not check for that identifier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parsing macro_rules is not special, only when expanding it, does it allow the extra identifier.
A "macro definition" is the new macro keyword

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm understanding it a little better. However, macro calls are parsed differently in various situations, and some don't allow the extra ident (type, pattern, trait/impl item, foreign item). Eventually these differences will need to be captured in the grammar, right?

Also, I think the syntax for macro_rules should be captured somewhere, since it is useful for certain rust tooling, syntax highlighting, etc. It is now in the reference, but I'm hoping the wg-grammar and the reference can be closely tied if not identical. Maybe the reference will just have extra stuff beyond the wg-grammar?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think the syntax for macro_rules should be captured somewhere

We could do this, especially since macro definitions are not just macro invocations, and we could even have a converter from macro_rules input patterns to CFGs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually these differences will need to be captured in the grammar, right?

Ideally we would find a way to keep adding rules without making the grammar unreadable and without losing the ability to encode into (E)BNF (automatically).

MacroInput =
"(" TOKEN_TREE* ")" |
"[" TOKEN_TREE* "]" |
"{" TOKEN_TREE* "}";

// FIXME(eddyb) could maybe be shared more with `MacroInput`?
// The problem is the semicolons for the `()` and `[]` cases.
ItemMacroCall = path:Path "!" ident_input:IDENT? input:ItemMacroInput;
ItemMacroInput =
"(" TOKEN_TREE* ")" ";" |
"[" TOKEN_TREE* "]" ";" |
"{" TOKEN_TREE* "}";
39 changes: 39 additions & 0 deletions grammar/pat.g
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
Pat =
Wild:"_" |
Literal:{ minus:"-"? lit:LITERAL } |
Range:{ start:PatRangeValue ".." end:PatRangeValue } |
Centril marked this conversation as resolved.
Show resolved Hide resolved
RangeInclusive:{ start:PatRangeValue { "..." | "..=" } end:PatRangeValue } |
Binding:{ binding:Binding { "@" subpat:Pat }? } |
Paren:{ "(" pat:Pat ")" } |
Ref:{ "&" mutable:"mut"? pat:Pat } |
Box:{ "box" pat:Pat } |
Centril marked this conversation as resolved.
Show resolved Hide resolved
Slice:{ "[" elems:SlicePatElem* % "," ","? "]" } |
Tuple:{ "(" fields:TuplePatField* % "," ","? ")" } |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does gll match in order? Like here, does (a) always match the Paren rule? In isolation, reading this rule doesn't make it clear that with a single element, a trailing comma is required.

Also, trailing , can't follow ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GLL matches every possible option. This initial draft of the grammar lacks any disambiguation.

Path:QPath |
TupleStruct:{ path:Path "(" fields:TuplePatField* % "," ","? ")" } |
Struct:{ path:Path "{" fields:StructPatFieldsAndEllipsis "}" } |
MacroCall:MacroCall;

PatRangeValue =
Literal:{ minus:"-"? lit:LITERAL } |
Path:QPath;

Binding = boxed:"box"? reference:"ref"? mutable:"mut"? name:IDENT;

SlicePatElem =
Subslice:{ subpat:Pat? ".." } |
Pat:Pat;

TuplePatField =
Ellipsis:".." |
Pat:Pat;

// FIXME(eddyb) find a way to express this `A* B?` pattern better
StructPatFieldsAndEllipsis =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Individual fields (including ..) can have outer attributes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is being tested! Can you remove it from the official parser and show that it breaks no tests? And then add tests. I can also do it if you want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it is covered by src/test/ui/structs/struct-field-cfg.rs.

Fields:{ fields:StructPatField* % "," ","? } |
Ellipsis:{ ".." } |
FieldsAndEllipsis:{ fields:StructPatField+ % "," "," ".." };

StructPatField =
Shorthand:Binding |
Explicit:{ field:FieldName ":" pat:Pat };
10 changes: 8 additions & 2 deletions grammar/path.g
Original file line number Diff line number Diff line change
@@ -1,2 +1,8 @@
Path = global:"::"? segments:PathSegment* % "::";
PathSegment = ident:IDENT; // TODO generics
Path = global:"::"? path:RelativePath;
RelativePath = segments:PathSegment+ % "::";
PathSegment = ident:IDENT { "::"? args:GenericArgs }?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need $crate if you want to cover macros.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should parse as IDENT (it's... complicated. ask @petrochenkov).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should write this down somewhere less ephemeral ;)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less ephemeral - rust-lang/rust#55640.


QSelf = "<" ty:Type { "as" trait_path:Path }? ">";
QPath =
Unqualified:Path |
Qualified:{ qself:QSelf "::" path:RelativePath };
7 changes: 7 additions & 0 deletions grammar/stmt.g
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
Stmt =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be useful to have a separate entry for macro call here. It would need to be like ItemMacroCall, and maybe call it something different (since it is not necessarily an item). foo!(); can either be an item or an expression statement, and currently this grammar only includes it as an item.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also included as an expression, though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yea, assuming expression is fixed to deal with ;. I'm maybe a little confused because parse_stmt_without_recovery explicitly handles macros and does not defer to item or expr to handle it. (i.e., why does StmtKind have a separate Mac variant?).

Item:Item |
Local:{ attrs:OuterAttr* "let" pat:Pat { ":" ty:Type }? { "=" init:Expr }? ";" } |
Expr:Expr |
Semi:";";
Copy link

@petrochenkov petrochenkov Dec 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If nonterminals in this grammar are supposed to match fragments in macro_rules, then Stmt needs to be organized slightly differently, more like

Stmt =
    Item:Item |
    Local:{ attrs:OuterAttr* "let" pat:Pat { ":" ty:Type }? { "=" init:Expr }? } |
    Expr:Expr;

StmtX =
    Item |
    Local ";" |
    Expr |
    ";";
 Block = "{" attrs:InnerAttr* StmtX* "}";


Block = "{" attrs:InnerAttr* Stmt* "}";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to handle ; separator between expressions? It's rather tricky to express, but I think in the reference grammar it is relatively succinct.

Loading