Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nanobind Python bindings #439

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

mnxn
Copy link

@mnxn mnxn commented Sep 27, 2024

Closes #433

This PR adds a new target, NanobindImath for the nanobind version of the pybind11 bindings. To build the nanobind bindings, the -DNANOBIND CMake option must be 'ON'. pybind11 bindings are still available.

As expected, nanobind is almost entirely compatible with the previous pybind11 bindings with only a few differences:

  • The def_readwrite member function is renamed to def_rw
  • The module type is renamed to module_. Probably so it doesn't collide with the C++20 modules keywords.
  • In CMake, calling nanobind_build_library(nanobind-static) is necessary to build a static library for the libNanobindImath... shared library to link with.

I couldn't test compilation time/runtime performance very reliably, but the resulting binaries are significantly smaller.

Less than half the size for the python module:

356K    lib/python3.12/site-packages/nanobindimath.cpython-312-x86_64-linux-gnu.so
740K    lib/python3.12/site-packages/pybindimath.cpython-312-x86_64-linux-gnu.so

And less than 1/10 of the size for the shared library.

88K     lib/libNanobindImath_Python3_12-3_2.so.30.3.2.0
1016K   lib/libPyBindImath_Python3_12-3_2.so.30.3.2.0

Both the nanobind and pybind11 bindings are fairly small compared to the Boost bindings, so these size ratios may not scale linearly, but I think it's promising.

Signed-off-by: Max Lantas <mnxndev@outlook.com>
Copy link

linux-foundation-easycla bot commented Sep 27, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@cary-ilm
Copy link
Member

This is awesome, thanks!! I'll look through it in greater detail shortly.

@mnxn
Copy link
Author

mnxn commented Sep 27, 2024

I edited the CI matrix to build the nanobind version in addition to pybind11.

While I was working on initial PR, I also tried get the unit tests running for both versions of the bindings, but couldn't figure out how to structure the CMake files to import the generated Python module without an error.

Signed-off-by: Max Lantas <mnxndev@outlook.com>
@lgritz
Copy link
Contributor

lgritz commented Sep 27, 2024

I'm unclear on the big picture strategy with regard to pybind11 and nanobind. Are you proposing that we build both sets? Or that we switch from pybind11 to nanobind completely? What's the compatibility story between the two? If some other project that uses pybind11 and expects to use the Imath data types from Python, does Imath switching to nanobind interfere with that? Do we have an enumeration of the reasons why a project would not be able to switch from pybind11 to nanobind?

@mnxn
Copy link
Author

mnxn commented Sep 27, 2024

Hi @lgritz,

The goal is to replace pybind11 completely. I only left the pybind11 directories since #433 mentioned duplicating the bindings and there are still open tickets for adding more class wrappers to the pybind11 version. The nanobind and pybind11 C++ API's are almost identical, so it should be straightforward to migrate even if any new pybind11 C++ files are created.

I don't believe there's a compatibility problem. The choice of nanobind vs. pybind11 is an internal implementation detail. Users of Imath in Python would only be directly interacting with Python objects that are wrappers for the underlying C++ data types. In other words, the Python library users would only see that there is a Python class named Box2d with the size method, they wouldn't know whether it was implemented through nanobind or pybind11.

Additionally, all of the pybind11 or nanobind code is statically linked into the resulting Python module and shared library. I can't see any reason that a user of the nanobind Imath Python bindings couldn't also use bindings to some other library written with pybind11.

@lgritz
Copy link
Contributor

lgritz commented Sep 27, 2024

But what about another C++ library that has a pybind11-mediated Python API, whose public APIs use the Imath classes?

I feel like at some point somebody told me that this was not going to work if one was nanobind and the other was pybind11. But I don't know enough to say why that is a problem.

@mnxn
Copy link
Author

mnxn commented Sep 28, 2024

Oh, I see what you're saying. In that case, no, nanobind doesn't have that type of compatibility.

nanobind is only largely compatible with pybind11 at the API level for C++. It doesn't claim to be compatible at the ABI-level. In fact, breaking with pybind11 internal memory layout is the main reason that nanobind is a separate project: it allows nanobind to introduce optimizations that wouldn't otherwise be possible.

If the reason for introducing pybind11 and moving away from Boost.Python was to make it possible for external projects that already use pybind11 to create bindings that accept Imath Python objects, then nanobind cannot be a replacement. For an external project to create bindings that are compatible with the Imath Python module, that project must use the same version of the Python binding library that Imath uses, whether that be Boost.Python, pybind11, or nanobind.

@mnxn
Copy link
Author

mnxn commented Sep 28, 2024

Do we have an enumeration of the reasons why a project would not be able to switch from pybind11 to nanobind?

To answer your earlier question, the only reason an external project can't switch to nanobind from pybind11 in order to write bindings compatible with the Imath Python module is if it uses one of the removed features.

@lgritz
Copy link
Contributor

lgritz commented Sep 28, 2024

A major issue is that nanobind ONLY allows python bindings of a C++ library. But pybind11 could do some other tricks, including letting you embed a python interpreter into an application. That's another one I'm very concerned about.

Anyway, there's no reason why nanobind isn't suitable for Imath in isolation. But I worry that it's unsuitable for other things that need to use Imath classes in their public APIs, or that need to embed python as an extension language for applications.

It feel like it's possible that what we'd need to do is try to get the whole local ecosystem (ASWF projects and some siblings) to all switch from pybind11 to nanobind, or else always need to deal with incompatibilities cropping up.

@mnxn
Copy link
Author

mnxn commented Sep 28, 2024

But pybind11 could do some other tricks, including letting you embed a python interpreter into an application.

There are at least a few instances of this in the ASWF repos: /~https://github.com/search?q=org%3AAcademySoftwareFoundation+language%3AC%2B%2B+pybind11%2Fembed&type=code

I suppose those files could be rewritten to use the CPython embedding API directly since that's basically what pybind11 is doing, but it may be difficult to replicate some behavior depending on how the pybind11 embedding feature interacts with the generated bindings.

It feel like it's possible that what we'd need to do is try to get the whole local ecosystem (ASWF projects and some siblings) to all switch from pybind11 to nanobind, or else always need to deal with incompatibilities cropping up.

Yes, that seems like the only way a nanobind migration could work. Unfortunately, within the ASWF repos alone, there are already 20 pybind11-generated Python modules: /~https://github.com/search?q=org%3AAcademySoftwareFoundation+%28path%3A*.cpp+OR+path%3A*.cc%29+PYBIND11_MODULE%28&type=code

I wouldn't be surprised if at least one of the projects used other pybind11 features that were also removed by nanobind. Migrating all of these projects would be a massive undertaking that would involve creating workarounds for all of the missing features.

I don't see a path forward for this pull request.

@mnxn mnxn closed this Sep 28, 2024
@cary-ilm
Copy link
Member

Catching up on this now. This is the kind of analysis I was hoping to see, so it's a helpful contribution even if it goes no further.

The reason we're considering this in the first place is the desire to drop the current dependence on Boost.Python. Any new release of Imath without Boost.Python is going to require a major rewrite on the part of projects that uses the current Boost.Python PyImath C++ classes. ABI compatibility was never an objective. The thought was, if you're going to rewrite, would there be advantages to rewriting for nanobind instead of pybind11?

Alembic is the only such open source library I'm aware of, but there may be others. The other pybind11-based ASWF projects don't currently use PyImath's C++ API that I'm aware of, although maybe they would if there weren't compatibility problems. @lamiller0 contributed the original pybind11 Imath WIP so presumably there was some thought about the implications for Alembic.

I'm still unclear on exactly what this means in relation to Imath and to projects like Alembic that use the C++ Imath API, regarding pybind11 features missing in nanobind:

○ Multi-intepreter, Embedding: The ability to embed Python in an executable or run several independent Python interpreters in the same process is unsupported. Nanobind caters to bindings only. Multi-interpreter support would require TLS lookups for nanobind data structures, which is undesirable.

The next bit of evaluation to consider is exactly what the port of Alembic to a pybind11-based Imath would look like, and what wrinkles a nanobind alternative would introduce there.

@lamiller0
Copy link
Contributor

I think a minimal external C++ binding library that provides 1 method that depends on PyImath bindings to take a list of V3d from python, use those to calculate the bounds and then return that to python should be good enough to demonstrate external compatibility for most other libraries. Alembic is mostly reliant on PyImath for the usual data types, V3f, V3d, M33d, M44d, etc.

We had a first pass at a PR for supporting pybind11 but it never made its way into release because we were unsure of how to ingest PyImath in a familiar way. It doesn't need to stay pybind11, we would be willing to switch over to a different binding as long as it was compatible with a future PyImath.

@mnxn
Copy link
Author

mnxn commented Sep 30, 2024

I think a minimal external C++ binding library that provides 1 method that depends on PyImath bindings to take a list of V3d from python, use those to calculate the bounds and then return that to python should be good enough to demonstrate external compatibility for most other libraries.

I'll try to prepare this in the near future.

If Alembic is the only major project to bind against the Imath API in this way, then that certainly changes the scope of this change.

@mnxn mnxn reopened this Sep 30, 2024
@mnxn mnxn marked this pull request as draft September 30, 2024 20:40
@mnxn
Copy link
Author

mnxn commented Oct 23, 2024

Sorry for the delay. I implemented a small example external library in nanobind and pybind11 here: /~https://github.com/mnxn/imath-python-binding-example

Both the nanobind and pybind11 bindings are able to bind against Imath types without a problem. Both of these work:

  • Imath+nanobind V3d Python object -> External library function via nanobind -> Imath+nanobind Box3d Python object
  • Imath+pybind11 V3d Python object -> External library function via pybind11 -> Imath+pybind11 Box3d Python object

Couldn't figure out the proper way to reference Imath from an external CMake project, but if you install this branch's version of Imath to /usr/local/lib/..., then run.bash should print the successful results:

nanobind:
  (min) <1.0, 1.0, 1.0>
  (max) <3.0, 3.0, 3.0>

pybind11:
  (min) <1.0, 1.0, 1.0>
  (max) <3.0, 3.0, 3.0>

As expected, mixing and matching nanobind & pybind11 doesn't work: passing a Imath+nanobind V3d to external library bindings created by pybind11 or vice versa results in a TypeError: /~https://github.com/mnxn/imath-python-binding-example/blob/9a3853072bfeb72a3edb60feaca099e05fed1a64/bounds.py#L34-L58

As long as Imath and Alembic can agree on which binding library to use, I don't see anything blocking a migration away from Boost.Python.

@cary-ilm
Copy link
Member

Thank you for the analysis, it helps clarify the situation. I'm inclined to say that Imath should continue migrating to pybind11 and drop consideration of nanobind, but it's worth a larger discussion within the ASWF to confirm that's the consensus direction. And we'll be sure to coordinate with Alembic.

Since Imath is a rather static library, not expected to evolve significant new functionality, it's not out of the question to continue to support multiple binding libraries.

@wjakob
Copy link

wjakob commented Nov 7, 2024

Chiming in as an outsider but someone involved with both binding projects. I strongly recommend against starting new projects on pybind11 and migrating existing projects to nanobind if possible. Pybind11 has severe and unfixable problems. It will only get worse in the future with the significant changes happening in the Python ecosystem.

@lgritz
Copy link
Contributor

lgritz commented Nov 7, 2024

"as someone involved with both binding projects"

@lgritz
Copy link
Contributor

lgritz commented Nov 7, 2024

@wjakob Unfortunately, to the best of my knowledge, nanobind's minimum python version is 3.8. A lot of our projects have, ugh, only recently been able to drop python 2.7, with 3.7 being the new minimum and quite frequently a constraint. So it might take a bit before we can solidly consider Python 3.9 our new minimum and thus fully unlock the possibility of switching to nanobind.

On a slightly different point, is there really no way to make nanobind and pybind11 able to converse? The whole migration could go faster it if could be done incrementally, project by project. The need to have lots of projects switch either simultaneously or not at all definitely holds things back. I'm thinking of cases where there are projects A and B, and B needs to use public types of A. They both have to be nanobind or both pybind11, but not mixed, correct?

@wjakob
Copy link

wjakob commented Nov 7, 2024

Python 3.7 is difficult to support since it doesn't have vector calls, which are a key architectural feature driving nanobind (those were introduced in 3.8. While nanobind supports 3.8, that version is probably not a good target since it is EOL since Oct 2024).

There has been some preliminary discussion abou basic interop. I can't yet say how that develop, but such a feature might alleviate the all-or-nothing requirement of porting a set of interconnected packages. You would really just need this though if you simultaneously want to use pybind11+nanobind extensions containing bindings of functions that take instances of the same C++ class (e.g. MyClass), which implies that those binding projects use a common set of C++ header files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prototype nanobind bindings for Imath
5 participants