Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to configure parsers in packwerk #243

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

dnamsons
Copy link

What are you trying to accomplish?

As discussed on #142, this change allows for additional parsers to be added.

This also allows extracting the erb parser out of packwerk(which might be desired - #142 (comment)).

What approach did you choose and why?

  • modified the Packwerk::Parsers::Factory class to store parsers in an instance variable and made this variable modifiable via the use of an attr_accessor.
  • moved the regex constants into their respective parser classes and added a new method .match? to each parser class which verifies if a given path is relevant to the parser.
  • changed Packwerk::Parsers::Factory#for_path to iterate through the defined parsers to find a parser whose match? method returns true.

I considered multiple different alternatives to how new parsers can be defined(and the existing ones removed) but could not land on a satisfactory solution, so I stuck with the simplest one in hopes I could get some advice on how best to proceed.

Currently every parser is initialized anew for every new file that matches with its regex. I considered either storing already initialized parser instances in the parsers instance variable or, alternatively, creating a sort of a "cache" of parser instances but wasn't sure on which would be preferable.

What should reviewers focus on?

As mentioned above, my current approach to the the configuration of parsers is simplistic, I would appreciate any input on how to make this be more in line with the general structure of packwerk.

Type of Change

  • Bugfix
  • New feature
  • Non-breaking change (a change that doesn't alter functionality - i.e., code refactor, configs, etc.)

Additional Release Notes

  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • It is safe to rollback this change.

@dnamsons dnamsons requested a review from a team as a code owner October 26, 2022 14:28
@alexevanczuk
Copy link
Contributor

alexevanczuk commented Oct 27, 2022

@dnamsons Thank you for this PR! One thing @gmcgibbon and I talked about that we were hoping to do before merging this one is to work on a consistent experience in packwerk for loading in extensions, since we will soon have the ability to extend not just parsers, but checkers and formatters too. With that in mind, we are hoping to put up PRs to allow packwerk.yml to have a require directive similar to rubocop (https://docs.rubocop.org/rubocop/extensions.html#loading-extensions) and then use a standard interface for adding extensions to packwerk so the client has a consistent experience.

Would you mind if we waited a bit on us reviewing + merging this until we can add that? That way this could be released on top of the new extension platform.

@dnamsons
Copy link
Author

Hey @alexevanczuk, that's quite all right!

I'd love to get involved if there is anything I can do to help with this effort.

end
end

Packwerk::Parsers::Factory.instance.parsers.append(SlimParser)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of asking the user to explicitly specify these parsers, can we have them included automatically?

Here's what I had in mind:

# typed: strict
# frozen_string_literal: true

module Packwerk
  module Parsers
    module ParserInterface
      extend T::Helpers
      extend T::Sig

      requires_ancestor { Kernel }

      interface!

      sig { params(base: Class).void }
      def self.included(base)
        @parsers ||= T.let(@parsers, T.nilable(T::Array[Class]))
        @parsers ||= []
        @parsers << base
      end

      sig { returns(T::Array[ParserInterface]) }
      def self.all
        T.unsafe(@parsers).map(&:new)
      end

      sig { abstract.params(io: File, file_path: String).returns(T.untyped) }
      def call(io:, file_path:)
      end
    end
  end
end

Then, the user just needs to define this in a file that is "required" within packwerk.yml via this new API in this upcoming PR: #245

Elsewhere, we can just check Packwerk::Parsers::ParserInterface.all to get the list of all of the parsers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is certainly a better approach than what I had created, as it makes adding new parsers easier 👍

The only concern i have is that the method Packwerk::Parsers::ParserInterface.all seems like it's not cohesive to the idea of a parser interface, to me it would seem more appropriate if we were to call Packwerk::Parsers.all and be able to get all of the registered parsers. What do you think about something like this?

# lib/packwerk/parsers/parser_interface.rb
module Packwerk
  module Parsers
    module ParserInterface
      extend T::Helpers
      extend T::Sig

      requires_ancestor { Kernel }

      interface!

      sig { params(base: Class).void }
      def self.included(base)
        Packwerk::Parsers.register(base)
      end

      sig { abstract.params(io: File, file_path: String).returns(T.untyped) }
      def call(io:, file_path:)
      end

      sig { abstract.params(path: String).returns(T.nilable(ParserInterface)) }
      def match?(path)
      end
    end
  end
end
# lib/packwerk/parsers.rb
module Packwerk
  module Parsers
    autoload :Erb, "packwerk/parsers/erb"
    autoload :Factory, "packwerk/parsers/factory"
    autoload :ParserInterface, "packwerk/parsers/parser_interface"
    autoload :Ruby, "packwerk/parsers/ruby"

    sig { params(parser: ParserInterface).void }
    def self.register(parser)
      @parsers ||= T.let([], T.nilable(T::Array[ParserInterface]))
      
      @parsers << parser
    end

    sig { returns(T::Array[ParserInterface]) }
    def self.all
      T.unsafe(@parsers).map(&:new)
    end
  end
end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't imagine the ability to get all parsers to be a public API. I do think that's a cleaner internal API for packwerk though! However I'm not sure we'll want to do this for some of the other things we are able to or will be able to extend (offenses formatters, checkers) since there isn't an existing module for them (i.e. no standalone Formatters or Checkers modules), and I'm not sure if we'll want to add one just to hold the registered interfaces.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention wasn't to make the listing of parsers as a public API, just to make it a bit cleaner for internal use, but I can see that adding modules for Formatters and Checkers just to hold extensions might be a bit excessive.

I'll change my Pull request to use the solution that You outlined, so we don't have differing APIs for how Parsers get stored vs. Formatters & Checkers.

@alexevanczuk
Copy link
Contributor

Thanks @dnamsons ! Let me know what you think of the require API I referenced in my comment within this PR. Also, let me know what you think of the API to actually perform the extension. Also, I think you'll need to sign Shopify's CLA.

@dnamsons
Copy link
Author

@alexevanczuk Having looked at the referenced PR, the require API implementation looks good to me 🎉

As for the CLA, I signed it right after creating the PR.

@alexevanczuk
Copy link
Contributor

@alexevanczuk Having looked at the referenced PR, the require API implementation looks good to me 🎉

As for the CLA, I signed it right after creating the PR.

Great!

@gmcgibbon Any thoughts on why the CLA check continues to fail? (/~https://github.com/Shopify/packwerk/actions/runs/3329890441/jobs/5543985448)

@alexevanczuk
Copy link
Contributor

@dnamsons I just merged #245. When you get a chance, could you update this PR with docs referencing the require directive as the way for the user to load their parser extension?


```ruby
class SlimParser
include ParserInterface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnamsons In some of the other merged PRs, we ended up just going with the plain Checker and Validator as names for the interfaces to be included to extend packwerk. Can we just call this Parser, i.e. include Packwerk::Parser

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense 👍
the Ruby parser also defines the Parser namespace, to avoid confusion, I added Packerk:: to every usage of this interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should rename to something like FileParser?

@alexevanczuk
Copy link
Contributor

Hey @dnamsons Are you still hoping to merge this? Let me know if I can help with anything!

@dnamsons
Copy link
Author

Hey @alexevanczuk, sorry, didn't have much free time lately 😞

I've applied your suggestions, now I just have to update the documentation to match the changes 🎉

@@ -99,7 +121,7 @@ class CustomParser < Packwerk::Parsers::Erb
end
end

Packwerk::Parsers::Factory.instance.erb_parser_class = CustomParser
Packwerk::Parsers::Factory.instance.parsers = [Packwerk::Parsers::Ruby, CustomParser]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexevanczuk After these changes, I don't see an easy way to override the erb parser.

IMO the best approach would be to move the erb parser to a seperate gem, but if we want to keep it in packwerk and have a way of configuring it, then I will add a configuration for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share more about why you want to override the ERB parser? I know we want to add new parsers, but why override the other one?

Separately, I think if we want to extract ERB, /~https://github.com/rubyatscale/packwerk-extensions would be a great home for it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexevanczuk I don't have a particular use case, but the API for overriding the erb parser was there before my changes(/~https://github.com/Shopify/packwerk/blob/main/USAGE.md#using-a-custom-erb-parser) so I was thinking about how to keep feature parity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting... I see a couple of options:

  1. We could keep the existing API exactly as is and implement it by having an override when getting the parser to use (the override would check if the parser to fetch is ERB and use the overridden value if provided). The advantages here are we don't have to think about changing the API at all.
  2. We could eliminate this API entirely and when someone wants to custom parse an ERB, they could just pass in their own custom ERB parser that parsers out things the default parser cannot. Advantage here is that it's one less "extension" API to support.

What do you think @dnamsons and @gmcgibbon?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnamsons I took another look at this. What if we changed this line:
/~https://github.com/Shopify/packwerk/pull/243/files#diff-79f5d7b9ceb194cd2ea7989b1a3839f0c8f8d14cdcd2ff7dfe02299498a82648R14

To select instead of find the parsers, which will allow multiple parsers to parse the same file? That way, someone can "extend" how any type of file is being parsed (ruby, erb, etc.)?

Then we can remove this API and its behavior can be ported to an ERB parser extension.

@@ -56,7 +56,7 @@ def call(absolute_file)
private

sig do
params(node: Parser::AST::Node, absolute_file: String).returns(T::Array[UnresolvedReference])
params(node: ::Parser::AST::Node, absolute_file: String).returns(T::Array[UnresolvedReference])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing the prefixed :: definitely makes me feel inclined to prefer Packwerk::FileParser over Parser so that we don't fight with the parser library. I suppose it's a "temporary" problem if we eventually swap out parser for syntax_tree or something else, but can't go wrong with FileParser I think.

def erb_parser_class=(klass)
@erb_parser_class = klass
@erb_parser = nil
Packwerk::Parser.all.find { |parser| parser.match?(path: path) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if someday we'll want to allow multiple parsers to parse the same file if they do different things (e.g. a ruby file parser that takes a special DSL and turns it into a constant reference, although, like relations, I think that is currently done in the references_from_ast step).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we probably don't need to change this now since the current behavior only allows one parser for a file!

@alexevanczuk
Copy link
Contributor

Hey @dnamsons I think we're almost ready for a final review on this. I'd love to include in packwerk 3.0 if we can shore this one up!

Here's what's left, as I understand it:

  • Remove references to Packwerk::Parsers::Factory.erb_parser_class= and instead allow multiple parsers to parse the same file and update docs to include references to the new API for adding a new ERB parser class
  • Rename lib/packwerk/parser.rb to lib/packwerk/file_parser.rb to address name clashing

@athieriot
Copy link

Hello there !

Is it possible to know if there is any news on the possibility of merging this ?

@alxckn
Copy link
Contributor

alxckn commented Sep 4, 2023

@dnamsons @alexevanczuk
Do you think it's still possible to get this PR in shape and merged? I would be interested in implementing the latest requested changes to make it happen

@richardmarbach
Copy link

Hey @dnamsons, I'm interested in adding Haml support and would use this parser extension. If you don't have time to finish it, I'd be willing to complete the latest requested changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants