Python Interface (.pyi) generation and runtime inspection #2454
Comments
@CLOVIS-AI just a ping to say I haven't forgotten about this; have been ill / busy and was just about cleared through the backlog enough to read this when #2481 came up. I think I need to push through some security releases first, after which my plan was to finish #2302 and then loop back here with us ready to support a syntax for annotations. Sorry for the delay. |
@davidhewitt don't worry about the full review for now, it's just a prototype. If you have time, please just read this issue and give me your general feedback on the idea. If it seems good with you, I'll be able to start writing a real PR for at least a part of it and we can do a full review then 👍 |
Hey, so I finally found a moment to sit down and think about this. Thank you for working on this and for having patience with me. This looks great, I think this is definitely the right way to go. In particular splitting into two traits for the input/output I think is correct. Some thoughts and questions:
Overall, yes, I'm happy to proceed with this - as a first step I'd suggest we get |
I was thinking of feature-gating the macro generation but not the inspection API itself (so you would be able to construct inspection APIs yourself in all cases, but would need to enable the automatic implementation). I assume that the API itself will not have any significant effect on compile-time, since it's just normal structs. What do you think? The About custom generics: the approach in this documentation couldn't, but the one in #2490 can (however, it must be through user input, I don't see a way in which the macros could guess that). Combining external information with the generated ones will be trivial: because the macros will generate an implementation of the inspection API, and the .pyi generation will take an implementation as parameter, users can simply edit the generated implementation before passing it to the .pyi generation. |
It seems like #2490 will be merged soon. I won't have a lot of time on my hands in the near future, so if someone else wants to help in the meantime, the next big question is how to represent the program (Python classes, Python methods, Python modules) as Rust structures. My prototypes are close to solving the problem, except that I'm not a fan of how they deal with modules. The structures themselves seem fine, but the way to convert a |
Not sure if it is relevant here: I have written a small Python script to generate type stubs from pyo3 libraries with doc strings including type annotations (using the |
@Tpt That's great! However if I understand correctly you still have to declare the type twice (first as a Rust type, then as a Python type in the documentation), which is error-prone, and what this issue tries to avoid. I agree that it's already a great step up from the current situation of writing the .pyi entirely manually. |
@CLOVIS-AI Yes! Exactly. Indeed, avoiding duplicated types would be much better. I wanted to get something working quickly for now instead of having to enter the "auto generation from Rust" rabbit hole. |
Would love to have this! |
Looking forward to the features! |
Hi, I have changed workplace and do not have time to contribute to this project anymore. If someone wants to continue this PR, please feel free to. My prototype is still online, and the outline described here should be good. |
I'm not experienced enough to help with this, just testifying that for my use case it would be of great help. Well, in the meantime, I'm going to write the |
2882: inspect: gate behind `experimental-inspect` feature r=davidhewitt a=davidhewitt This is the last thing I want to do before preparing 0.18 release. The `pyo3::inspect` functionality looks useful as a first step towards #2454. However, we don't actually make use of this anywhere within PyO3 yet (we could probably use it for better error messages). I think we also have open questions about the traits which I'd like to resolve before committing to these additional APIs. (For example, this PR adds `IntoPy::type_output`, which seems potentially misplaced to me, the `type_output` function probably wants to be on a non-generic trait e.g. `ToPyObject` or maybe #2316.) As such, I propose putting these APIs behind an `experimental-inspect` feature gate for now, and invite users who find them useful to contribute a finished-off design. Co-authored-by: David Hewitt <1939362+davidhewitt@users.noreply.github.com>
I've played around with #2447 in the last few days and I tried to fix some failed tests. I got stuck at missing `<crate::PyResult<&crate::PyAny,> as _pyo3::conversion::IntoPy<_>>::type_output()` Simply providing I was thinking about moving
pub trait WithTypeInfo {
    fn type_output() -> TypeInfo;
    fn type_input() -> TypeInfo;
}
with |
@op8867555 good question, and I'm not sure I can give you the answer easily. The downside of moving into a separate trait is that you might find without specialization this creates a lot of work. Having the methods on the I think the best answer is - if you're willing to give it a go, please do, and let's see how that works out :) |
I've tried the separate trait approach, and it solved the Also, I've tried embedding field info into Footnotes |
Note that this is very much the case in PyO3 that
This setup may potentially work:
struct TypeAnnotation<T>(PhantomData<T>);
impl<T> WithTypeInfo for &'_ TypeAnnotation<T> {
    fn type_input() -> TypeInfo { TypeInfo::Any }
    fn type_output() -> TypeInfo { TypeInfo::Any }
}
and specific implementations can then use
Yep that should work fine 👍 |
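The reference-based fallback discussed in the comments above resembles what is often called "autoref specialization" on stable Rust. The following is only a minimal sketch of that general technique, not the code from the prototype: the two trait names (`FallbackTypeInfo`, `SpecificTypeInfo`), the reduced `TypeInfo` enum and the use of `&self` methods instead of the static methods shown in the thread are all assumptions made for illustration.

```rust
use std::marker::PhantomData;

/// Hypothetical, reduced stand-in for the `TypeInfo` discussed in this thread.
#[derive(Debug, PartialEq)]
pub enum TypeInfo {
    Any,
    Builtin(&'static str),
}

/// Zero-sized wrapper that carries the Rust type being annotated.
pub struct TypeAnnotation<T>(pub PhantomData<T>);

/// Fallback (hypothetical name), implemented for a *reference* to the wrapper.
/// It applies to every `T`.
pub trait FallbackTypeInfo {
    fn type_output(&self) -> TypeInfo;
}

impl<T> FallbackTypeInfo for &TypeAnnotation<T> {
    fn type_output(&self) -> TypeInfo {
        TypeInfo::Any
    }
}

/// Specific case (hypothetical name), implemented for the wrapper itself.
/// It needs one auto-ref less than the fallback, so method resolution
/// finds it first when it exists.
pub trait SpecificTypeInfo {
    fn type_output(&self) -> TypeInfo;
}

impl SpecificTypeInfo for TypeAnnotation<i64> {
    fn type_output(&self) -> TypeInfo {
        TypeInfo::Builtin("int")
    }
}

/// `i64` has a specific implementation, so that one is selected.
pub fn annotation_for_i64() -> TypeInfo {
    (&TypeAnnotation::<i64>(PhantomData)).type_output()
}

/// `String` has no specific implementation, so the fallback is selected.
pub fn annotation_for_string() -> TypeInfo {
    (&TypeAnnotation::<String>(PhantomData)).type_output()
}
```

The selection relies on method resolution at a call site where the concrete type is known, so this pattern tends to fit macro-generated code better than generic functions.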
Oh, I didn't notice that 😅 . Are there any other specialization cases PyO3 creates?
I tried this (with some modification) and didn't manage to make it work with user-defined data types (e.g. providing a type annotation for a Rust enum like this). There will be an error when trying to provide an impl for a non-pyclass data type since both |
I would like to offer a possible alternative path: use the "structured representation" of types in JSON format (a nightly feature), similar to pavex. Since the types are used outside of the crate, you don't need to have any Rust code generated, so having the final representation of types like this would avoid lots of the problems and rough edges of a macro-based solution. Also, the crate would not need to be nightly to have this; you would only need a nightly installation to run the script that builds the type hints. |
If I understand correctly, the "structured representation" of types is the JSON output of Rustdoc. If yes, I find it is definitely an interesting idea, thank you! I see two advantages: 1. it does not require playing with the cdylib objects to emit introspection data 2. it offers a full view of the source code, including elements only built on a given target. However, I see a major downside: making the macro emit the introspection data avoids a lot of code duplication: the piece of code responsible for emitting the CPython-compatible data structure (class descriptor...) is also responsible for emitting the introspection data, so there is a single place of definition. An external introspection system based on Rustdoc JSON would have to reimplement a big chunk of this logic, making discrepancies easier to introduce and, probably, leading to a significant amount of slightly duplicated code. @davidhewitt What do you think about it? |
Yep, that's right.
Having the same logic split in 2 places indeed looks bad. |
I made an alternative option for just creating the pyi file using a proc macro. It has a lot of rough edges but it works for our use case. |
We've released our stub file generator crate: This crate tries to extract type information on the Rust side using proc-macros, and gathers this information with the inventory crate like |
@termoshtt Amazing! Thank you! If I try to summarize the differences between your crate and my PR #3977:
|
@Tpt have you looked at the linked PRs in the initial issue? They can manage generic types with no issues. |
@CLOVIS-AI Yes! Thank you so much for them. They have been a great inspiration. If I understood them correctly, they follow the same approach as |
I apologize for the (low-key) spam, but...thank you very much @termoshtt! This is very much what we've been looking for in our projects and should save us quite a bit of time once we get this up and rolling! |
@Tpt Thanks for the summary! I apologize if I'm mistaken, as I haven't fully read through #3977, but my brief response is as follows
Yes. In addition, our macro
would be true. Honestly, I never considered the multi-crate case.
Since I think the generated stub file is usually platform independent, I intended to use
#[cfg_attr(feature = "stub_gen", gen_stub_pyclass)]
#[pyclass]
struct A;
Then |
Yes. My long-term goal was to have it be executed by Maturin as part of the build step, so the result could be embedded in the package directly. |
@termoshtt Thank you! I agree with all you said. I think the difference in design between your crate and my MR is mostly because I wanted to support the multi-crate use case and stubs that differ by platform (I have some code triggering these two edge cases). If we choose not to support these two use cases, your design is indeed much better. |
It would be awesome to upstream it into PyO3 some day! |
Hey everyone. I have just tried out the
I struggled with the following use case:
#[pyfunction]
#[gen_stub_pyfunction]
pub fn just_some_fn() -> ThirdPartyType {...}
To me, the whole point is to reduce the amount of manual documentation that needs to be created, which is why I want to rely fully on the derive macros. Currently I cannot find how to use them in this context. I also tried to manually insert entries with As a suggested improvement: consider using To summarize: |
Just wanted to add another use case for automatically generated stub files. To generate API documentation, libraries (e.g., Because |
Why not execute stub generation just before |
Hi!
This issue is going to be a summary of my prototypes to generate Python Interface files ('stub files', '.pyi files') automatically. The prototypes are available as #2379 and #2447.
`#[pyclass]` structs (which methods exist, what arguments do they accept) to be read at run-time by the stub generator.
I'm presenting the results here to get feedback on the current approach. I'm thinking of extracting parts of the prototypes as standalone features and PRs.
Progress
Accessing type information at runtime:
#[pyclass]
Accessing structural information at runtime:
`#[pyclass]` and `#[pymethods]`
Python interface generation:
Summary
The final goal is to provide a way for developers who use PyO3 to automatically generate Python Interface files (.pyi) with type information and documentation, to enable Rust extensions to 'feel' like regular Python code for end users via proper integration in various tools (MyPy, IDEs, Python documentation generators).
I have identified the following steps to achieve this goal. Ideally, each step will become its own PR as a standalone feature.
`List[Union[str]]`, not just `PyList`).
1 and 2 are independent, 3 and 4 are independent.
Full type information
The goal of this task is to provide a simple way to access the string representation of the Python type of any object exposed to Python. This string representation should follow the exact format of normal Python type hints.
First, a structure representing the various types is created (simplified version below, prototype here):
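A minimal sketch of what such a structure could look like, rendered in the format of normal Python type hints. The variant names other than `Any` are assumptions chosen for illustration and are not taken from the prototype.

```rust
use std::fmt;

/// Hypothetical, simplified model of a Python type hint.
#[derive(Debug, Clone, PartialEq)]
pub enum TypeInfo {
    /// The top type, used when nothing more precise is known.
    Any,
    /// A type without parameters, e.g. `int` or `str`.
    Builtin(&'static str),
    /// `List[T]`
    List(Box<TypeInfo>),
    /// `Union[A, B, ...]`
    Union(Vec<TypeInfo>),
}

impl fmt::Display for TypeInfo {
    /// Renders the type using the syntax of Python type hints.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TypeInfo::Any => write!(f, "Any"),
            TypeInfo::Builtin(name) => write!(f, "{}", name),
            TypeInfo::List(item) => write!(f, "List[{}]", item),
            TypeInfo::Union(members) => {
                let parts: Vec<String> = members.iter().map(|m| m.to_string()).collect();
                write!(f, "Union[{}]", parts.join(", "))
            }
        }
    }
}

/// Builds the hint for a list whose items are either `str` or `int`.
pub fn example_hint() -> TypeInfo {
    TypeInfo::List(Box::new(TypeInfo::Union(vec![
        TypeInfo::Builtin("str"),
        TypeInfo::Builtin("int"),
    ])))
}
```

With these definitions, `example_hint().to_string()` yields `List[Union[str, int]]`.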
PyO3 already has traits that represent conversion to/from Python: `IntoPy` and `FromPyObject`. These traits can be enhanced to return the type information. The Python convention is that all untyped values should be considered as `Any`, so the methods can be added with `Any` as a default to avoid breaking changes (simplified version below, prototype here):
The rationale for adding two different methods is:
`derive(FromPyObject)`), so adding the method to only one of the traits would not work in all cases,
`Mapping<K, V>` as input and `Dict<K, V>` as output. Using two different methods supports this use case out-of-the-box.
After this is implemented for built-in types (prototype here), using them becomes as easy as `format!("The type of this value is {}", usize::type_input())` which gives `"The type of this value is int"`.
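The two methods with `Any` as a default can be sketched as follows. This is a self-contained illustration under stated assumptions: `FromPython` and `ToPython` are hypothetical stand-ins for PyO3's `FromPyObject` and `IntoPy`, and the reduced `TypeInfo` enum is not the one from the prototype.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical, reduced stand-in for the prototype's type description.
#[derive(Debug, Clone, PartialEq)]
pub enum TypeInfo {
    Any,
    Builtin(&'static str),
    /// A parameterised hint such as `Mapping[str, int]`.
    Generic(&'static str, Vec<TypeInfo>),
}

impl fmt::Display for TypeInfo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TypeInfo::Any => write!(f, "Any"),
            TypeInfo::Builtin(name) => write!(f, "{}", name),
            TypeInfo::Generic(name, args) => {
                let args: Vec<String> = args.iter().map(|a| a.to_string()).collect();
                write!(f, "{}[{}]", name, args.join(", "))
            }
        }
    }
}

/// Stand-in for `FromPyObject`: what Python code may pass in.
pub trait FromPython {
    fn type_input() -> TypeInfo {
        TypeInfo::Any // default keeps existing implementations valid
    }
}

/// Stand-in for `IntoPy`: what Python code gets back.
pub trait ToPython {
    fn type_output() -> TypeInfo {
        TypeInfo::Any
    }
}

impl FromPython for usize {
    fn type_input() -> TypeInfo { TypeInfo::Builtin("int") }
}
impl ToPython for usize {
    fn type_output() -> TypeInfo { TypeInfo::Builtin("int") }
}
impl FromPython for String {
    fn type_input() -> TypeInfo { TypeInfo::Builtin("str") }
}
impl ToPython for String {
    fn type_output() -> TypeInfo { TypeInfo::Builtin("str") }
}

// Input and output hints may differ: any mapping is accepted, a dict is returned.
impl<K: FromPython, V: FromPython> FromPython for HashMap<K, V> {
    fn type_input() -> TypeInfo {
        TypeInfo::Generic("Mapping", vec![K::type_input(), V::type_input()])
    }
}
impl<K: ToPython, V: ToPython> ToPython for HashMap<K, V> {
    fn type_output() -> TypeInfo {
        TypeInfo::Generic("Dict", vec![K::type_output(), V::type_output()])
    }
}

/// A type that does not override anything and therefore reports `Any`.
pub struct Opaque;
impl FromPython for Opaque {}
```

Here `HashMap<String, usize>` reports `Mapping[str, int]` as input and `Dict[str, int]` as output, while `Opaque` falls back to `Any`.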
Inspection API
This section consists of creating an API to represent Python objects.
The main entry point for users would be the `InspectClass` trait (simplified, prototype here):
A similar trait would be created for modules, so it becomes possible to access the list of classes in a module.
This requires creating a structure for each Python language element (`ModuleInfo`, `ClassInfo`, `FieldInfo`, `ArgumentInfo`…, prototype here).
At this point, using this API would require instantiating all structures by hand.
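To make "instantiating all structures by hand" concrete, here is a rough sketch. Only the names `InspectClass`, `ModuleInfo`, `ClassInfo`, `FieldInfo` and `ArgumentInfo` come from the proposal; the field layouts, the `MethodInfo` struct, the `inspect` method name and the `Point` example are assumptions made for illustration.

```rust
/// Hypothetical layout: one argument of a Python method.
#[derive(Debug, Clone, PartialEq)]
pub struct ArgumentInfo {
    pub name: &'static str,
    pub py_type: &'static str,
}

/// Hypothetical layout: one attribute of a Python class.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldInfo {
    pub name: &'static str,
    pub py_type: &'static str,
}

/// Hypothetical struct (not named in the proposal): one method of a class.
#[derive(Debug, Clone, PartialEq)]
pub struct MethodInfo {
    pub name: &'static str,
    pub args: Vec<ArgumentInfo>,
    pub returns: &'static str,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ClassInfo {
    pub name: &'static str,
    pub fields: Vec<FieldInfo>,
    pub methods: Vec<MethodInfo>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ModuleInfo {
    pub name: &'static str,
    pub classes: Vec<ClassInfo>,
}

/// Entry point: a type that can describe itself as a Python class.
pub trait InspectClass {
    fn inspect() -> ClassInfo;
}

pub struct Point {
    pub x: f64,
    pub y: f64,
}

// Written by hand here; the proposal is for macros to generate this later.
impl InspectClass for Point {
    fn inspect() -> ClassInfo {
        ClassInfo {
            name: "Point",
            fields: vec![
                FieldInfo { name: "x", py_type: "float" },
                FieldInfo { name: "y", py_type: "float" },
            ],
            methods: vec![MethodInfo {
                name: "scale",
                args: vec![ArgumentInfo { name: "factor", py_type: "float" }],
                returns: "None",
            }],
        }
    }
}

/// Collects the classes of a module into a single description.
pub fn geometry_module() -> ModuleInfo {
    ModuleInfo { name: "geometry", classes: vec![Point::inspect()] }
}
```

Because the result is plain data, a caller could adjust it (for example, refine a field type) before handing it to a later generation step.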
Compile-time generation
Proc-macros can statically generate all information needed to automatically implement the inspection API: structural information (fields, etc) are already known, and type information can simply be delegated to the `IntoPy` and `FromPyObject` traits, since all parameters and return values must implement at least one of them.
Various prototypes:
`#[pyo3(get, set)]`,
This is done via two new traits, `InspectStruct`, `InspectImpl` which respectively contain the information captured from `#[pyclass]` and `#[pymethods]`. Due to this, this prototype is not compatible with `multiple-pymethods`. I do not know whether it is possible to make it compatible in the future.
Python Interface generator
Finally, a small runtime routine can be provided to generate the .pyi file from the compile-time extracted information (prototype here).
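A rough sketch of such a routine, assuming the extracted information is available as plain structs. The struct layouts and the `render_class` function are hypothetical and much smaller than what the prototype handles (no modules, docstrings or imports).

```rust
/// Hypothetical, reduced inspection data for one method.
pub struct MethodInfo {
    pub name: &'static str,
    /// (argument name, Python type hint) pairs, excluding `self`.
    pub args: Vec<(&'static str, &'static str)>,
    pub returns: &'static str,
}

/// Hypothetical, reduced inspection data for one class.
pub struct ClassInfo {
    pub name: &'static str,
    /// (field name, Python type hint) pairs.
    pub fields: Vec<(&'static str, &'static str)>,
    pub methods: Vec<MethodInfo>,
}

/// Renders one class as the text of a `.pyi` stub.
pub fn render_class(class: &ClassInfo) -> String {
    let mut out = format!("class {}:\n", class.name);
    for (name, py_type) in &class.fields {
        out.push_str(&format!("    {}: {}\n", name, py_type));
    }
    for method in &class.methods {
        let mut params = vec!["self".to_string()];
        params.extend(method.args.iter().map(|(n, t)| format!("{}: {}", n, t)));
        out.push_str(&format!(
            "    def {}({}) -> {}: ...\n",
            method.name,
            params.join(", "),
            method.returns
        ));
    }
    if class.fields.is_empty() && class.methods.is_empty() {
        // An empty class body still needs a placeholder in Python syntax.
        out.push_str("    ...\n");
    }
    out
}

/// Example input, as the inspection step might produce it.
pub fn point_class() -> ClassInfo {
    ClassInfo {
        name: "Point",
        fields: vec![("x", "float")],
        methods: vec![
            MethodInfo { name: "norm", args: vec![], returns: "float" },
            MethodInfo { name: "scale", args: vec![("factor", "float")], returns: "None" },
        ],
    }
}
```

Writing the returned string to a file such as `geometry.pyi` next to the compiled extension would be the final step; where and when that happens (for example during the build) is left open here.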
Thanks to the previous steps, it is possible to retrieve all information necessary to create a complete typed interface file with no further annotations from a user of the PyO3 library. I think that's pretty much the perfect scenario for this feature, and although it seemed daunting at first, I don't think it's so far fetched now 😄
The current state of the prototype is described here: #2447 (comment).