Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify JSX and TSX lexers #1492

Merged
merged 9 commits into from
May 30, 2020

Conversation

pyrmont
Copy link
Contributor

@pyrmont pyrmont commented Apr 9, 2020

The JSX lexer is implemented with an embedded version of itself that is used for lexing interpolated text. This is not the manner in which other lexers handle interpolation and so makes maintenance difficult as patterns developed in other lexers for handling interpolation cannot be brought across. It also makes subclassing the lexer difficult (a practical concern because the TSX lexer subclasses the JSX lexer).

To fix this, this PR rewrites the JSX lexer. By using two interpolation states, it obviates the need for an embedded version.

In the course of implementation, it was discovered that the JavaScript lexer contains a bug in the :statement state. This state contains a rule for lexing braces but this rule causes the :object state to never be closed. The bug was discovered in the implementation of the rewrite because braces are used to begin and end interpolation.

Having a simplified implementation of the JSX lexer makes it straightforward to implement a more robust TSX lexer. The PR includes a version of the TSX lexer that fixes a bug where generics that are expressed in the form <T,> would not be lexed correctly.

This fixes #1517.

@pyrmont pyrmont added the needs-review The PR needs to be reviewed label Apr 9, 2020
@pyrmont pyrmont self-assigned this Apr 9, 2020
@pyrmont
Copy link
Contributor Author

pyrmont commented Apr 9, 2020

@louisrli In the course of trying to fix the TSX lexer, I ended up writing a version that didn't rely on the JSX lexer at all. I thought it was a significant enough improvement that I essentially replaced the JSX lexer with it. It now appears to lex your buggy case properly:

tsx

Let me know if you spot anything amiss!

@pyrmont pyrmont changed the title Implement JSX lexer without embedded JSX lexer Fix JavaScript, TypeScript, JSX and TSX lexers Apr 9, 2020
@pyrmont pyrmont marked this pull request as draft April 9, 2020 19:58
@pyrmont
Copy link
Contributor Author

pyrmont commented Apr 9, 2020

As a note: I'll merge this into master by making separate PRs that are more targeted. This is mainly intended as a proof of concept of how the new architecture would fit together.

@pyrmont pyrmont marked this pull request as ready for review May 30, 2020 18:00
@pyrmont
Copy link
Contributor Author

pyrmont commented May 30, 2020

Looking at this a few weeks after writing it, I don't think it is a problem to merge this as a single commit.

@pyrmont
Copy link
Contributor Author

pyrmont commented May 30, 2020

I've remembered what it was. I thought the changes in #1526 and #1527 should be separate PRs and that the fix for #1363 should be a separate PR. I'll separate that all out and then merge.

@pyrmont pyrmont force-pushed the bugfix.jsx-simplification branch from fbefe54 to ebb9ad1 Compare May 30, 2020 18:30
@pyrmont pyrmont force-pushed the bugfix.jsx-simplification branch from ebb9ad1 to c375d0d Compare May 30, 2020 18:59
@pyrmont pyrmont changed the title Fix JavaScript, TypeScript, JSX and TSX lexers Simplify JSX and TSX lexers May 30, 2020
@pyrmont pyrmont merged commit 3b9c422 into rouge-ruby:master May 30, 2020
@pyrmont pyrmont deleted the bugfix.jsx-simplification branch May 30, 2020 19:27
mattt pushed a commit to NSHipster/rouge that referenced this pull request May 19, 2021
The JSX lexer handles interpolation by running an embedded version of
itself. This is not the manner in which other lexers handle
interpolation and so makes maintenance difficult. It also greatly
complicates subclassing the lexer (a practical concern because the TSX
lexer subclasses the JSX lexer).

To fix this, this commit rewrites the JSX lexer to simplify its
structure. In particular, by specifying `:interpol` and
`:interpol_inner` states, the need for an embedded version of the lexer
is obviated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-review The PR needs to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JSX: Opening tag not rendered correctly
1 participant