Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should ml_meth be an union? #2129

Closed
mejrs opened this issue Jan 27, 2022 · 8 comments
Closed

Should ml_meth be an union? #2129

mejrs opened this issue Jan 27, 2022 · 8 comments

Comments

@mejrs
Copy link
Member

mejrs commented Jan 27, 2022

In #2126 (comment) I mentioned that transmuting function pointers makes me nervous.

The particular code example is this:

let wrapped_sum_as_string = PyMethodDef {
    ml_name: "sum_as_string\0".as_ptr() as *const c_char,
    ml_meth: Some(transmute::<_PyCFunctionFast, PyCFunction>(sum_as_string)),
    ml_flags: METH_FASTCALL,
    ml_doc: "returns the sum of two integers as a string\0".as_ptr() as *const c_char,
};

Representing the ml_meth field with an union should make this less tricky and error-prone:

#[repr(C)]
#[derive(Copy, Clone)]
pub union Foo {
    PyCFunction: PyCFunction,
    PyCFunctionWithKeywords: PyCFunctionWithKeywords,
    #[cfg(not(Py_LIMITED_API))]
    _PyCFunctionFast: _PyCFunctionFast,
    #[cfg(not(Py_LIMITED_API))]
    _PyCFunctionFastWithKeywords: _PyCFunctionFastWithKeywords,
    None: *const libc::c_void,
}

It then becomes:

let wrapped_sum_as_string = PyMethodDef {
    ml_name: "sum_as_string\0".as_ptr() as *const c_char,
    ml_meth: Foo {
        _PyCFunctionFast: sum_as_string,
    },
    ml_flags: METH_FASTCALL,
    ml_doc: "returns the sum of two integers as a string\0".as_ptr() as *const c_char,
};

What do you think? Are there any caveats to this?

@birkenfeld
Copy link
Member

The enum shouldn't be a problem, as long as equal repr is guaranteed, but I the transmute just mirrors what is done on the CPython side, and I don't see how it could be problematic...

@davidhewitt
Copy link
Member

The union looks nicer for avoiding the transmute, sure, yet I'm not sure it really wins us much. CPython essentially does the C equivalent of a transmute with the function pointer in ml_meth anyway depending on the value of ml_flags, so it's not like it's any more safe in reality.

Personally I prefer that the current definition exactly matches the C. If the majority want the union I can be won round. Note that changing it would be a breaking change, so I'd also argue if there's not much clear benefit it's not worth the ecosystem churn. (Not sure how many users are using ffi directly, though there are clearly some!)

@mejrs
Copy link
Member Author

mejrs commented Jan 28, 2022

Thanks for the feedback 😃 Let me elaborate some more.

What I like about it is that it's safer on the Rust side. For example, consider an user just doing this:

let wrapped_sum_as_string = PyMethodDef {
    ml_name: "sum_as_string\0".as_ptr() as *const c_char,
    ml_meth: transmute(sum_as_string)),
    ml_flags: METH_FASTCALL,
    ml_doc: "returns the sum of two integers as a string\0".as_ptr() as *const c_char,
};

The benefits of an union here are that:

  • It restricts the types you can put into it. Essentially it uses the type system to sidestep some of the unsafeness. You of course still have to get theml_flags correct.
  • It forces a compilation failure if you try using non-Py_LIMITED_API function pointer types on Py_LIMITED_API.
  • It explains in code what the possible variants for the argument are. This thing already is an enum - ml_flags is the tag and ml_meth are the possible variants. I'd rather express (some of) this in code than hope that users read the documentation.
  • It makes it easier to do the right thing. If you make it easy to mess up, people will do just that.

Currently there is barely any protection against accidentally changing a function to an incompatible type or just getting it wrong in the first place. If we can't even get the function types correct I think we should be helping users get this right.

Note that changing it would be a breaking change

0.16 will already be a breaking change in this area anyway - this code will stop compiling because of the fix to _PyCFunctionFast.

let wrapped_sum_as_string = PyMethodDef {
    ml_name: "sum_as_string\0".as_ptr() as *const c_char,
    ml_meth: Some(transmute::<_PyCFunctionFast, PyCFunction>(sum_as_string)),
    ml_flags: METH_FASTCALL,
    ml_doc: "returns the sum of two integers as a string\0".as_ptr() as *const c_char,
};

@adamreichold
Copy link
Member

CPython essentially does the C equivalent of a transmute with the function pointer in ml_meth anyway depending on the value of ml_flags, so it's not like it's any more safe in reality.

I would also like to add that this would probably be a net improvement for CPython as well, replacing the infinite set of types one can transmute/cast into by the finite set of types which are part of the union.

For us, it could also help immediately available documentation, e.g. the doc comments of the union variants could explain exactly which ml_meth flags their are associated with making the underlying enum/tagged union structure more visible.

@davidhewitt
Copy link
Member

Fair enough, the usability arguments are quite compelling. This wouldn't be the only place where the ffi bindings do have
slight differences to the C (even if they are ABI compatible). Want to put this into a PR and we can see how it looks?

I would also like to add that this would probably be a net improvement for CPython as well, replacing the infinite set of types one can transmute/cast into by the finite set of types which are part of the union.

Agreed, though I suspect that it's far too big a breaking change for CPython to consider doing that upstream.

@mejrs
Copy link
Member Author

mejrs commented Feb 1, 2022

Want to put this into a PR and we can see how it looks?

Sure 👍

@adamreichold
Copy link
Member

I think this can be closed now that #2166 was merged?

@davidhewitt
Copy link
Member

Yep, good catch!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants