Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API to allow extensions to set callback function on creation, modification, and destruction of PyFunctionObject #91053

Closed
mpage mannequin opened this issue Mar 1, 2022 · 8 comments
Labels
3.12 bugs and security fixes topic-C-API type-feature A feature request or enhancement

Comments

@mpage
Copy link
Mannequin

mpage mannequin commented Mar 1, 2022

BPO 46897
Nosy @carljm, @DinoV, @itamaro, @mpage

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2022-03-01.22:19:44.618>
labels = ['expert-C-API', 'type-feature', '3.11']
title = 'Add API to allow extensions to set callback function on creation, modification, and destruction of PyFunctionObject'
updated_at = <Date 2022-03-01.22:19:44.618>
user = '/~https://github.com/mpage'

bugs.python.org fields:

activity = <Date 2022-03-01.22:19:44.618>
actor = 'mpage'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['C API']
creation = <Date 2022-03-01.22:19:44.618>
creator = 'mpage'
dependencies = []
files = []
hgrepos = []
issue_num = 46897
keywords = []
message_count = 1.0
messages = ['414308']
nosy_count = 4.0
nosy_names = ['carljm', 'dino.viehland', 'itamaro', 'mpage']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue46897'
versions = ['Python 3.11']

@mpage
Copy link
Mannequin Author

mpage mannequin commented Mar 1, 2022

CPython extensions providing optimized execution of Python bytecode (e.g. the Cinder JIT) may need to hook into the lifecycle of function objects to determine what to optimize, invalidate previously-optimized functions, or free resources allocated for functions that no longer exist. For example, when inlining a function, the Cinder JIT will use the bytecode of the inlined function that was known at compile-time. If the bytecode for the inlined function changes at runtime (i.e. if __code__ was reassigned) the JIT needs to invalidate any code into which the function was inlined. We propose adding an API to allow extensions to set callbacks that will be invoked whenever functions are created, modified, or destroyed.

Proposed API:

typedef enum {
  PYFUNC_LCEVT_CREATED,
  PYFUNC_LCEVT_MODIFIED,
  PYFUNC_LCEVT_DESTROYED
} PyFunction_LifecycleEvent;

typedef enum {
  PYFUNC_ATTR_CODE,
  PYFUNC_ATTR_GLOBALS,
  PYFUNC_ATTR_DEFAULTS,
  PYFUNC_ATTR_KWDEFAULTS,
  PYFUNC_ATTR_CLOSURE,
  PYFUNC_ATTR_NOT_APPLICABLE,
} PyFunction_AttrId;

// A callback to be called in response to events in a function's lifecycle.
//
// The callback is invoked after a function is created and before the function 
// is modified or destroyed.
//
// On modification the third argument indicates which attribute was modified
// and the fourth argument is the new value.
// Otherwise the third argument is PYFUNC_ATTR_NOT_APPLICABLE and the fourth
// argument is NULL.
typedef void(*PyFunction_LifecycleCallback)(
  PyFunction_LifecycleEvent event, 
  PyFunctionObject* func,
  PyFunction_AttrId attr,
  PyObject* new_value);

void PyFunction_SetLifecycleCallback(PyFunction_LifecycleCallback callback);
PyFunction_LifecycleCallback PyFunction_GetLifecycleCallback();

@mpage mpage mannequin added 3.11 only security fixes topic-C-API type-feature A feature request or enhancement labels Mar 1, 2022
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@gvanrossum
Copy link
Member

@markshannon How does this proposal from @mpage fit into our current plans?

@erlend-aasland erlend-aasland added 3.12 bugs and security fixes and removed 3.11 only security fixes labels May 26, 2022
@markshannon
Copy link
Member

First we need to do something about comprehensions (and nested functions).
Function objects are created for each comprehension.
Ideally we would not create functions for list comprehensions, or treat them specially, as the function objects are inaccessible.
Closures are trickier, as they can be ephemeral, but are accessible.

Provided we can avoid hooking into the lifetime of these short live objects, then adding hooks makes sense.
We will need to handle much the same set of events as Cinder does.

I'd like to add these hooks in a principled way in the broader context of handling potential de-optimization events.
Maybe extending the API for dictionary watchers?

@carljm
Copy link
Member

carljm commented Jul 31, 2022

We have code in cinder's compiler to inline comprehensions instead of creating a function. It is a perf win but there is a semantic compromise in scoping / name visibility, not sure that would be acceptable.

Can you outline what you're thinking in terms of unified API? My intuition is that the needs and details of eg dict watching vs function watching are sufficiently different that separate APIs in PyDict_* and PyFunction_* will probably be simpler and clearer to use, but open to suggestions.

@gvanrossum
Copy link
Member

Ping? @markshannon ?

@markshannon
Copy link
Member

We have code in cinder's compiler to inline comprehensions instead of creating a function. It is a perf win but there is a semantic compromise in scoping / name visibility, not sure that would be acceptable.

I doubt that it would be in general, but for cases where the iteration variable is not shadowed or used outside the comprehension it might be. It would need a wider discussion.

Can you outline what you're thinking in terms of unified API?

It is a bit vague at the moment, but I do need to write it up properly. For now, I'm thinking that any object that is depended on, whether dict, function or class would be allocated an ID. Optimized code would depend on a set of IDs. If any object with an ID changes then the associated optimized code(s) would be invalidated. It should be possible to implement this reasonably efficiently with bloom filters, or radix trees, or some other suitable data structure.

How does Cinder handle this?

@markshannon
Copy link
Member

Why do functions need callbacks, rather than code objects?

Once we have implemented faster-cpython/ideas#446, we can effectively specialize calls to comprehensions and nested functions, without caring about the lifetimes of the individual function objects.

@carljm
Copy link
Member

carljm commented Sep 15, 2022

I'm thinking that any object that is depended on, whether dict, function or class would be allocated an ID. Optimized code would depend on a set of IDs. If any object with an ID changes then the associated optimized code(s) would be invalidated.

I don't think this type of API will be sufficient for us in general. We want to be able to do code-patching on many updates, not just invalidate all generated code on any change. One reason this matters is because it handles the problem of OSR for functions further up the stack. If, say, a global value changes, we want to patch the generated code at the point where that specific global value is loaded with an unconditional deopt instruction, which will result in the correct behavior even if that optimized function is already mid-execution somewhere up the stack. So e.g. in the dict watchers API, we need details about what changed in the dict, not just the fact that the dict changed. Another example is that we do granular invalidation of our "inline" caches if the type whose information we cached is changed, we won't just throw away the generated code (which is independent of the caches and is still valid.)

How does Cinder handle this?

We use callback functions on relevant modifications to dictionaries, types, and functions, with custom handling appropriate to each case.

Why do functions need callbacks, rather than code objects?

I think the only case in which we actually depend on func-modified hooks today is if the function's __code__ is changed. Because we change a function's vectorcall entrypoint when we compile it, if its __code__ changes we need to reset to the default vectorcall entrypoint. I'm not sure if we have use cases in mind for hooking into other changes to funcs; maybe @mpage or @swtaarrs can weigh in if I'm missing something. We do modify MAKE_FUNCTION today, which in the future we might want to use a func-created hook for instead.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 3, 2023
…ers (pythonGH-106286)

(cherry picked from commit 5890621)

Co-authored-by: Carl Meyer <carl@oddbird.net>
carljm added a commit that referenced this issue Jul 3, 2023
…hers (GH-106286) (#106365)

gh-91053: make func watcher tests resilient to other func watchers (GH-106286)
(cherry picked from commit 5890621)

Co-authored-by: Carl Meyer <carl@oddbird.net>
carljm added a commit to carljm/cpython that referenced this issue Jul 3, 2023
* main: (167 commits)
  pythongh-91053: make func watcher tests resilient to other func watchers (python#106286)
  pythongh-104050: Add more type hints to Argument Clinic DSLParser() (python#106354)
  pythongh-106359: Fix corner case bugs in Argument Clinic converter parser (python#106361)
  pythongh-104146: Remove unused attr 'parameter_indent' from clinic.DLParser (python#106358)
  pythongh-106320: Remove private _PyErr C API functions (python#106356)
  pythongh-104050: Annotate Argument Clinic DSLParser attributes (python#106357)
  pythongh-106320: Create pycore_modsupport.h header file (python#106355)
  pythongh-106320: Move _PyUnicodeWriter to the internal C API (python#106342)
  pythongh-61215: New mock to wait for multi-threaded events to happen (python#16094)
  Document PYTHONSAFEPATH along side -P (python#106122)
  Replace the esoteric term 'datum' when describing dict comprehensions (python#106119)
  pythongh-104050: Add more type hints to Argument Clinic DSLParser() (python#106343)
  pythongh-106320: _testcapi avoids private _PyUnicode_EqualToASCIIString() (python#106341)
  pythongh-106320: Add pycore_complexobject.h header file (python#106339)
  pythongh-106078: Move DecimalException to _decimal state (python#106301)
  pythongh-106320: Use _PyInterpreterState_GET() (python#106336)
  pythongh-106320: Remove private _PyInterpreterState functions (python#106335)
  pythongh-104922: Doc: add note about PY_SSIZE_T_CLEAN (python#106314)
  pythongh-106217: Truncate the issue body size of `new-bugs-announce-notifier` (python#106329)
  pythongh-104922: remove PY_SSIZE_T_CLEAN (python#106315)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes topic-C-API type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

4 participants