Skip to content

Commit

Permalink
update parser chapter
Browse files Browse the repository at this point in the history
  • Loading branch information
mark-i-m committed Nov 12, 2019
1 parent 75a5f92 commit 934380b
Show file tree
Hide file tree
Showing 4 changed files with 32 additions and 23 deletions.
2 changes: 1 addition & 1 deletion src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@
- [Incremental compilation](./queries/incremental-compilation.md)
- [Incremental compilation In Detail](./queries/incremental-compilation-in-detail.md)
- [Debugging and Testing](./incrcomp-debugging.md)
- [The parser](./the-parser.md)
- [Lexing and Parsing](./the-parser.md)
- [`#[test]` Implementation](./test-implementation.md)
- [Macro expansion](./macro-expansion.md)
- [Name resolution](./name-resolution.md)
Expand Down
2 changes: 1 addition & 1 deletion src/appendix/code-index.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ Item | Kind | Short description | Chapter |
`SourceFile` | struct | Part of the `SourceMap`. Maps AST nodes to their source code for a single source file. Was previously called FileMap | [The parser] | [src/libsyntax_pos/lib.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax/source_map/struct.SourceFile.html)
`SourceMap` | struct | Maps AST nodes to their source code. It is composed of `SourceFile`s. Was previously called CodeMap | [The parser] | [src/libsyntax/source_map.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax/source_map/struct.SourceMap.html)
`Span` | struct | A location in the user's source code, used for error reporting primarily | [Emitting Diagnostics] | [src/libsyntax_pos/span_encoding.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax_pos/struct.Span.html)
`StringReader` | struct | This is the lexer used during parsing. It consumes characters from the raw source code being compiled and produces a series of tokens for use by the rest of the parser | [The parser] | [src/libsyntax/parse/lexer/mod.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax/parse/lexer/struct.StringReader.html)
`StringReader` | struct | This is the lexer used during parsing. It consumes characters from the raw source code being compiled and produces a series of tokens for use by the rest of the parser | [The parser] | [src/librustc_parse/lexer/mod.rs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html)
`syntax::token_stream::TokenStream` | struct | An abstract sequence of tokens, organized into `TokenTree`s | [The parser], [Macro expansion] | [src/libsyntax/tokenstream.rs](https://doc.rust-lang.org/nightly/nightly-rustc/syntax/tokenstream/struct.TokenStream.html)
`TraitDef` | struct | This struct contains a trait's definition with type information | [The `ty` modules] | [src/librustc/ty/trait_def.rs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/trait_def/struct.TraitDef.html)
`TraitRef` | struct | The combination of a trait and its input types (e.g. `P0: Trait<P1...Pn>`) | [Trait Solving: Goals and Clauses], [Trait Solving: Lowering impls] | [src/librustc/ty/sty.rs](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/struct.TraitRef.html)
Expand Down
3 changes: 3 additions & 0 deletions src/macro-expansion.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
# Macro expansion

> `libsyntax`, `librustc_expand`, and `libsyntax_ext` are all undergoing
> refactoring, so some of the links in this chapter may be broken.
Macro expansion happens during parsing. `rustc` has two parsers, in fact: the
normal Rust parser, and the macro parser. During the parsing phase, the normal
Rust parser will set aside the contents of macros and their invocations. Later,
Expand Down
48 changes: 27 additions & 21 deletions src/the-parser.md
Original file line number Diff line number Diff line change
@@ -1,29 +1,35 @@
# The Parser
# Lexing and Parsing

The parser is responsible for converting raw Rust source code into a structured
form which is easier for the compiler to work with, usually called an [*Abstract
Syntax Tree*][ast]. An AST mirrors the structure of a Rust program in memory,
using a `Span` to link a particular AST node back to its source text.
> The parser and lexer are currently undergoing a lot of refactoring, so parts
> of this chapter may be out of date.
The very first thing the compiler does is take the program (in Unicode
characters) and turn it into something the compiler can work with more
conveniently than strings. This happens in two stages: Lexing and Parsing.

The bulk of the parser lives in the [libsyntax] crate.
Lexing takes strings and turns them into streams of tokens. For example,
`a.b + c` would be turned into the tokens `a`, `.`, `b`, `+`, and `c`.
The lexer lives in [`librustc_lexer`][lexer].

Like most parsers, the parsing process is composed of two main steps,
[lexer]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html

- lexical analysis – turn a stream of characters into a stream of token trees
- parsing – turn the token trees into an AST
Parsing then takes streams of tokens and turns them into a structured
form which is easier for the compiler to work with, usually called an [*Abstract
Syntax Tree*][ast] (AST). An AST mirrors the structure of a Rust program in memory,
using a `Span` to link a particular AST node back to its source text.

The `syntax` crate contains several main players,
The AST is defined in [`libsyntax`][libsyntax], along with some definitions for
tokens and token streams, data structures/traits for mutating ASTs, and shared
definitions for other AST-related parts of the compiler (like the lexer and
macro-expansion).

- a [`SourceMap`] for mapping AST nodes to their source code
- the [ast module] contains types corresponding to each AST node
- a [`StringReader`] for lexing source code into tokens
- the [parser module] and [`Parser`] struct are in charge of actually parsing
tokens into AST nodes,
- and a [visit module] for walking the AST and inspecting or mutating the AST
nodes.
The parser is defined in [`librustc_parse`][librustc_parse], along with a
high-level interface to the lexer and some validation routines that run after
macro expansion. In particular, the [`rustc_parser::parser`][parser] contains
the parser implementation.

The main entrypoint to the parser is via the various `parse_*` functions in the
[parser module]. They let you do things like turn a [`SourceFile`][sourcefile]
[parser][parser]. They let you do things like turn a [`SourceFile`][sourcefile]
(e.g. the source in a single file) into a token stream, create a parser from
the token stream, and then execute the parser to get a `Crate` (the root AST
node).
Expand All @@ -44,14 +50,14 @@ Code for lexical analysis is split between two crates:
specific data structures. Specifically, it adds `Span` information to tokens
returned by `rustc_lexer` and interns identifiers.


[libsyntax]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/index.html
[rustc_errors]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_errors/index.html
[ast]: https://en.wikipedia.org/wiki/Abstract_syntax_tree
[`SourceMap`]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/source_map/struct.SourceMap.html
[ast module]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/ast/index.html
[parser module]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/parse/index.html
[librustc_parse]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html
[parser]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/index.html
[`Parser`]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/parse/parser/struct.Parser.html
[`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/parse/lexer/struct.StringReader.html
[`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html
[visit module]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/visit/index.html
[sourcefile]: https://doc.rust-lang.org/nightly/nightly-rustc/syntax/source_map/struct.SourceFile.html

0 comments on commit 934380b

Please sign in to comment.