-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add API to allow extensions to set callback function on creation, modification, and destruction of PyFunctionObject #91053
Comments
CPython extensions providing optimized execution of Python bytecode (e.g. the Cinder JIT) may need to hook into the lifecycle of function objects to determine what to optimize, invalidate previously-optimized functions, or free resources allocated for functions that no longer exist. For example, when inlining a function, the Cinder JIT will use the bytecode of the inlined function that was known at compile-time. If the bytecode for the inlined function changes at runtime (i.e. if __code__ was reassigned) the JIT needs to invalidate any code into which the function was inlined. We propose adding an API to allow extensions to set callbacks that will be invoked whenever functions are created, modified, or destroyed. Proposed API:
|
@markshannon How does this proposal from @mpage fit into our current plans? |
First we need to do something about comprehensions (and nested functions). Provided we can avoid hooking into the lifetime of these short live objects, then adding hooks makes sense. I'd like to add these hooks in a principled way in the broader context of handling potential de-optimization events. |
We have code in cinder's compiler to inline comprehensions instead of creating a function. It is a perf win but there is a semantic compromise in scoping / name visibility, not sure that would be acceptable. Can you outline what you're thinking in terms of unified API? My intuition is that the needs and details of eg dict watching vs function watching are sufficiently different that separate APIs in |
Ping? @markshannon ? |
I doubt that it would be in general, but for cases where the iteration variable is not shadowed or used outside the comprehension it might be. It would need a wider discussion.
It is a bit vague at the moment, but I do need to write it up properly. For now, I'm thinking that any object that is depended on, whether dict, function or class would be allocated an ID. Optimized code would depend on a set of IDs. If any object with an ID changes then the associated optimized code(s) would be invalidated. It should be possible to implement this reasonably efficiently with bloom filters, or radix trees, or some other suitable data structure. How does Cinder handle this? |
Why do functions need callbacks, rather than code objects? Once we have implemented faster-cpython/ideas#446, we can effectively specialize calls to comprehensions and nested functions, without caring about the lifetimes of the individual function objects. |
I don't think this type of API will be sufficient for us in general. We want to be able to do code-patching on many updates, not just invalidate all generated code on any change. One reason this matters is because it handles the problem of OSR for functions further up the stack. If, say, a global value changes, we want to patch the generated code at the point where that specific global value is loaded with an unconditional deopt instruction, which will result in the correct behavior even if that optimized function is already mid-execution somewhere up the stack. So e.g. in the dict watchers API, we need details about what changed in the dict, not just the fact that the dict changed. Another example is that we do granular invalidation of our "inline" caches if the type whose information we cached is changed, we won't just throw away the generated code (which is independent of the caches and is still valid.)
We use callback functions on relevant modifications to dictionaries, types, and functions, with custom handling appropriate to each case.
I think the only case in which we actually depend on func-modified hooks today is if the function's |
…ers (pythonGH-106286) (cherry picked from commit 5890621) Co-authored-by: Carl Meyer <carl@oddbird.net>
* main: (167 commits) pythongh-91053: make func watcher tests resilient to other func watchers (python#106286) pythongh-104050: Add more type hints to Argument Clinic DSLParser() (python#106354) pythongh-106359: Fix corner case bugs in Argument Clinic converter parser (python#106361) pythongh-104146: Remove unused attr 'parameter_indent' from clinic.DLParser (python#106358) pythongh-106320: Remove private _PyErr C API functions (python#106356) pythongh-104050: Annotate Argument Clinic DSLParser attributes (python#106357) pythongh-106320: Create pycore_modsupport.h header file (python#106355) pythongh-106320: Move _PyUnicodeWriter to the internal C API (python#106342) pythongh-61215: New mock to wait for multi-threaded events to happen (python#16094) Document PYTHONSAFEPATH along side -P (python#106122) Replace the esoteric term 'datum' when describing dict comprehensions (python#106119) pythongh-104050: Add more type hints to Argument Clinic DSLParser() (python#106343) pythongh-106320: _testcapi avoids private _PyUnicode_EqualToASCIIString() (python#106341) pythongh-106320: Add pycore_complexobject.h header file (python#106339) pythongh-106078: Move DecimalException to _decimal state (python#106301) pythongh-106320: Use _PyInterpreterState_GET() (python#106336) pythongh-106320: Remove private _PyInterpreterState functions (python#106335) pythongh-104922: Doc: add note about PY_SSIZE_T_CLEAN (python#106314) pythongh-106217: Truncate the issue body size of `new-bugs-announce-notifier` (python#106329) pythongh-104922: remove PY_SSIZE_T_CLEAN (python#106315) ...
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: