Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruby API: unfrozen contents of Prism::StringNode nodes #3309

Closed
viralpraxis opened this issue Dec 14, 2024 · 6 comments · Fixed by #3396
Closed

Ruby API: unfrozen contents of Prism::StringNode nodes #3309

viralpraxis opened this issue Dec 14, 2024 · 6 comments · Fixed by #3396
Labels
enhancement New feature or request

Comments

@viralpraxis
Copy link

Prism does not freeze contents of string literals:

Prism.parse('"foo"').value.statements.child_nodes.first.content.frozen?
=> false

Is this intentional? Is it part of the public contract, allowing API consumers to rely on this behavior, or could it potentially change in the future?

Prism version: 1.2.0

@kddnewton
Copy link
Collaborator

It's not really intentional that it's unfrozen, it was just the default. Is there something here you're looking to rely on? What's your use case?

@viralpraxis
Copy link
Author

Well, I was trying to debug an issue in rubocop/rubocop-ast related to mutating a frozen string. It turned out there's some inconsistency between parsers in the aspect of literal strings mutability, so I just wanted to learn more about Prism's approach and any potential contract guarantees. Anyway, it seems a bit odd to me that literals are unfrozen, as I would expect the entire AST to be immutable. Also, see this thread for some details.

Feel free to close this issue.

@eregon
Copy link
Member

eregon commented Dec 16, 2024

I think it would be good to have Strings in the Prism AST frozen (and in general "leaf non-node non-array data", probably most/all of which are already frozen besides strings).
For instances if a string comes from the constant pool it should definitely be frozen (so mutating it doesn't affect other parts of the AST), and whether it comes from the constant pool or not it should be consistent, so frozen.
OTOH freezing the AST nodes seems to me not so necessary.

@kddnewton
Copy link
Collaborator

Everything coming from the constant pool becomes a symbol in the final AST, so that's not something we need to worry about.

I'm okay with an enhancement to freeze the strings, seems fine.

viralpraxis added a commit to viralpraxis/rubocop-performance that referenced this issue Dec 18, 2024
Since sometimes AST string nodes might have their values frozen,
we should call `dup` explicitly to make sure `to_string_literal`
(which does not work with frozen strings) does not raise. Maybe
we should also apply a patch on RuboCop's core side to fix the
`to_string_literal` itself (it now uses `force_encoding` on its argument).

See [1] and [2] for details.

[1] ruby/prism#3309
[2] rubocop/rubocop-ast#342
viralpraxis added a commit to viralpraxis/rubocop that referenced this issue Dec 19, 2024
Currently, there are exactly three instances where the receiver-mutating
`String#force_encoding`` method is used. In all cases except for
`to_string_literal`, a `String#dup` call is included to ensure compatibility
with frozen strings. I couldn't identify any counterexamples in rubocop core that would result
in a runtime error, but there is at least one instance in `rubocop-performance` [1]
where a frozen string might be passed. To ensure everything works correctly,
we need to add a `#dup` call in the implementation of `to_string_literal`.

For additional context, it's worth noting that the Prism parser might start
freezing AST leaf nodes in the future [2],
and whitequark's parser has some unpredictability in this regard [3].

[1] rubocop/rubocop-performance#480
[2] ruby/prism#3309
[3] rubocop/rubocop-ast#342
viralpraxis added a commit to viralpraxis/rubocop that referenced this issue Dec 19, 2024
Currently, there are exactly three instances where the receiver-mutating
`String#force_encoding` method is used. In all cases except for
`to_string_literal`, a `String#dup` call is included to ensure compatibility
with frozen strings. I couldn't identify any counterexamples in rubocop core that would result
in a runtime error, but there is at least one instance in `rubocop-performance` [1]
where a frozen string might be passed. To ensure everything works correctly,
we need to add a `#dup` call in the implementation of `to_string_literal`.

For additional context, it's worth noting that the Prism parser might start
freezing AST leaf nodes in the future [2],
and whitequark's parser has some unpredictability in this regard [3].

[1] rubocop/rubocop-performance#480
[2] ruby/prism#3309
[3] rubocop/rubocop-ast#342
viralpraxis added a commit to viralpraxis/rubocop that referenced this issue Dec 20, 2024
Currently, there are exactly three instances where the receiver-mutating
`String#force_encoding` method is used. In all cases except for
`to_string_literal`, a `String#dup` call is included to ensure compatibility
with frozen strings. I couldn't identify any counterexamples in rubocop core that would result
in a runtime error, but there is at least one instance in `rubocop-performance` [1]
where a frozen string might be passed. To ensure everything works correctly,
we need to add a `#dup` call in the implementation of `to_string_literal`.

For additional context, it's worth noting that the Prism parser might start
freezing AST leaf nodes in the future [2],
and whitequark's parser has some unpredictability in this regard [3].

[1] rubocop/rubocop-performance#480
[2] ruby/prism#3309
[3] rubocop/rubocop-ast#342
viralpraxis added a commit to viralpraxis/rubocop-performance that referenced this issue Dec 21, 2024
Since sometimes AST string nodes might have their values frozen,
we should call `dup` explicitly to make sure `to_string_literal`
(which does not work with frozen strings) does not raise. Maybe
we should also apply a patch on RuboCop's core side to fix the
`to_string_literal` itself (it now uses `force_encoding` on its argument).

See [1] and [2] for details.

[1] ruby/prism#3309
[2] rubocop/rubocop-ast#342
@kddnewton kddnewton added the enhancement New feature or request label Dec 27, 2024
@tenderlove
Copy link
Member

Freezing strings in the AST by default seems like a good idea.

OTOH freezing the AST nodes seems to me not so necessary.

It might be interesting to add an option for doing this. Then Prism could produce an AST which is Ractor friendly. I think the JSON gem has a parse option for freezing all results (Psych definitely does), so maybe we could add the same to Prism.

@kddnewton
Copy link
Collaborator

I've opened a PR for this to make the strings frozen by default. It actually found a couple of things that may have been considered bugs (though you would never run into them in practice). So I'm happy about the change.

Aaron I really like your idea of making it Ractor-friendly, so I can work on an option for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants