-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-117482: Cleaning up tp_dict for static builtin types in subinterpreters #117660
Conversation
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Thanks for the detailed analysis and solution! It makes sense. I'll take a closer look at the PR in the next day or two. |
Thank you for taking the time to look into this! |
Hi @ericsnowcurrently, just checking in on the PR. No worries if you're busy, just making sure it didn’t slip through the cracks! |
Yeah, sorry it's taking so long to get to this. |
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
When initializing base types in a subinterpreters extra items are added to tp_dict which triggers incorrect behavior while running code under a subinterpreter. Here we're identifying which keys are not part of the original tp_dict in the main interpreter and removing them.
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
This bug has a severe impact on WSGI applications running on mod_wsgi, which uses subinterpreters to isolate the applications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few recommendations.
Misc/NEWS.d/next/Core and Builtins/2024-06-03-23-13-29.gh-issue-117482.p5mtQI.rst
Outdated
Show resolved
Hide resolved
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
…e-117482.p5mtQI.rst Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
Is the main issue the "tp" slots getting set during |
We only need to cleanup the dict slots when we're in a subinterpreter.
Slot wrappers should behave the same in the main interpreter as well as in all subinterpreters.
Moved cleanup to |
FYI, I tried out an alternative approach that is a bit more direct: gh-121602. I think the approach I took is more focused and probably less fragile. It also addresses a potential problem related to reinitializing the runtime. |
Looks great! |
Closing in favor of #121602 |
Thanks for working on this, @mliezun! The analysis you did made it a lot easier to find a solution for this. While we didn't end up using your fix, it was helpful for identifying what needed to happen. |
Thank you for looking into it! |
I think an alternative option is for |
Alright done that just now. #117482 (comment) |
Solves #117482
Problem description
When initializing base types in a subinterpreters extra items are added to tp_dict. For example, in the linked issue we use the following script to identify the problem:
While running on a subinterpreter
int.__str__
points to a slot inint
itself instead of pointing to the one defined inobject
as it should be.Tracking down the root cause
As can be seen in #117482 (comment), the problem was introduced in commit de64e75.
By further debugging the specific parts modified in that commit I found that the problem lies in the type_ready function.
In previous versions of Python
tp_dict
was shared between subinterpreters. Now each interpreter has their own copy for static builtin types.Let's take
int.__str__
as an example to see what's happening:type_ready
function gets executed as part of the main interpreter initialization process.int
type:tp_dict
gets filled by type_ready_fill_dict, one of the thing this function does is lookup which slots are not empty and add them to the dict, forint
thetp_str
slot is empty.tp_str
slot is not empty for int, it was filled by step 3. So it gets added to the dict bytype_ready_fill_dict
.The problem here is that the code expects
type_ready_fill_dict
to be called beforetype_ready_inherit
. Which is true for the main interpreter but not for the subinterpreter. Also, this is caused because all the slots inint
are shared except fortp_dict
which is stored in each interpreter.Tracking down all affected types
I ran a little script to track down all affected types and the differences in their base dicts:
This was used to compare outputs with previous python versions (<3.12) and to check if the proposed solution worked correctly.
Proposed Solution
The current proposed solution is to add a function called
cleanup_tp_dict
that checks if we're running in a subinterpreter, then identifies what extra keys were added to the static typetp_dict
and removes them.This function is called at the end of
type_ready_fill_dict
.Alternative solutions
PyObject_HashNotImplemented
to indicate that certain slots should not be filled, liketp_str
forint
.type_ready_fill_dict
on the main interpreter and just make a copy of the original one in subinterpreters.Conclusion
The proposed solution feels quite hacky, but gets the work done. This is my first contribution to CPython and I don't have much experience with it so there surely might be a better way to do it, there's probably some internal knowledge that I'm missing.
I'm happy to work on a different approach if pointed in the right direction.
Thank you for your time and all your effort building Python!
int.__str__
behaviour inside sub-interpreters #117482