Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Make Rust flyteidl PyO3-ready #5525

Draft
wants to merge 126 commits into
base: flyrs
Choose a base branch
from

Conversation

austin362667
Copy link
Contributor

@austin362667 austin362667 commented Jun 27, 2024

Tracking issue

#5344

Why are the changes needed?

This pull request focuses on modifying the Rust flyteidl library to pave the way for the introduction of flyteidl-rust as a Python binding. The changes address issues raised in issues and serve as a foundation for the PR.

What changes were proposed in this pull request?

Key Modifications:

  1. Custom Code Generator:

    Added a custom tonic_builder gRPC code generator. The code for this generator can be found in src/gen.rs. Will try to refactor into buf generator in follow-up PRs.

  2. Compile Well-Known Types:

    Background:

    1. Utilize the #[pyclass] attribute macro on our Rust structure. This macro implements the PyClass trait for the underlying structure "under the hood".
    2. Implementing the PyClass trait for a structure from an external crate (like prost_types) can lead to orphan rule violations in our local crate. The Rust compiler will not permit implementation of traits for types outside of the crate where the type is defined. (This is known as the "orphan rule".)

    Avoiding Orphan Rules:

    In some protobuf builders, like tonic_builder, there's an option to enable "compile well-known types". This forces the builder to compile well-known types (WKTs) directly using Rust primitive types instead of relying on external structures compiled from the prost_types crate.

    Alternatives Considered:

    One alternative is to create local wrappers around each external structure, as suggested by the PyO3 community. However, this solution becomes impractical when dealing with nested member fields within these structures. Because we still have to add PyClass trait all the way down to the nesting hierarchy. We'll elaborate more on gRPC client cases.

  3. implement PyClass Traits for Box<T>:

    T already has PyClass Traits. Box<T> needs but does not have PyClass Traits.
    For example, in flyteidl.core.rs, Box<T>: LiteralType, Literal, Union, Scalar, ConjunctionExpression, BooleanExpression, IfBlock, BranchNode, ArrayNode, a total of 9 types will be on the heap.

    What situations will be converted to boxes?

    • Recursive types, such as ArrayNode, can hold references to other Nodes and are converted to ArrayNode {node: Option<Box<Node>>, ...}.

    • This poses a challenge for PyO3, as it requires a PyClass to hold a reference to another PyClass. This is a Rust Non-lifetime-free issue.

    • While this PR supports Cell<T>, but didn't support Box<T>, it also highlights the difficulties of supporting Box<T>.

    • The PyO3 roadmap on addressing these issue seems like this way.

    Solution:

    Introduce conversion implementations and lifetime markers for the 9 types that use Box to prevent refs from outliving. For example, for Box, implement these two conversion traits instead of directly adding PyClass traits to Box:

    impl pyo3::conversion::IntoPy<pyo3::PyObject> for Box<Node> {
        fn into_py(self, py: pyo3::marker::Python<'_>) -> pyo3::PyObject {
            self.as_ref().clone().into_py(py)
        }
    }
    impl<'py> pyo3::conversion::FromPyObject<'py> for Box<Node> {
        fn extract_bound(obj: & pyo3::Bound<'py, pyo3::types:: PyAny>) ->  pyo3::PyResult<Self> {
            Ok(Box::new(Node::extract_bound(obj).unwrap()))
        }
    }

    Prevents references from outliving their intended lifetime, ensuring memory safety and preventing potential memory leaks.

  4. IntoRequest Trait:
    For cases like following example,

    let request = tonic::Request::new(HelloRequest {
        name: "Tonic".into(),
    });
    let response = client.say_hello(request).await?;

    Based on Types implementing this trait can be used as arguments to client RPC
    We can directly pass them to gRPC services, like:

    client.get_feature(Request::new(Point {})); // can work.
    client.get_feature(Point {}); // can work, too.
  5. Tuple Variants in Complex Enum: PyO3 supports tuple enum variants after version v0.22.0

  6. Python submodules: Use latest feature Declarative modules introduced in version v0.22.0.


  1. Optional: Concretize Generic Types

    Challenges:

    1. Each instantiation, for example: GenericType<i32>, GenericType<f64>, would have to have a different Python type object. This is an infinite set, and code would need to be generated for each instantiation. It isn't trivia for PyO3 to support.
    2. In our case, for any gRPC client service stubs may accepts arbitrary Channel, ChannelInterceptor, etc. We must specify the underlying concrete type.

    The Macro Solution

    1. Define Concrete Structures: We can create new structures (AdminStub in the example) with predefined concrete types for each gRPC stub. This allows specifying the exact types needed for PyO3 interaction.

    2. Using Macros for Automation: The following example leverages a Rust macro named concrete_generic_structure! to automate the creation of these concrete structures. It takes arguments for the structure name, the generic type it wraps, and the concrete type used. Additionally, it allows defining methods associated with the stub, specifying the request and response types.

    • The provided code demonstrates how the macro is used to create an AdminStub structure. This structure wraps the AdminServiceClient generic type, and specifies tonic::transport::Channel as the concrete type for the channel. Additionally, it defines two methods (create_task and get_task) with their specific request and response types.

      macro_rules! concrete_generic_structure {
      ($name:ident, $generic:ident, $type:ty, $( ($method_name:ident, $request_type:ty, $response_type:ty) ),* ) => {
        #[pyo3::pyclass]
        #[derive(Debug)]
        pub struct $name {
            pub inner: $generic<$type>,
            pub rt:  Runtime,
        }
      }
      
      concrete_generic_structure!(
        AdminStub,
        AdminServiceClient,
        tonic::transport::Channel,
        (
            create_task,
            flyteidl::admin::TaskCreateRequest,
            flyteidl::admin::TaskCreateResponse
        ),
        (
            get_task,
            flyteidl::admin::ObjectGetRequest,
            flyteidl::admin::Task
        )
      )

    Trade-offs:

    • Predefined Stub Support: This approach requires defining the stubs you want to support beforehand (e.g., defining Interceptor for channels).

How was this patch tested?

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

flyteorg/flytekit#2536

Docs link

…` crate

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
…ding

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
@austin362667 austin362667 changed the title Add PyO3-ready Rust structures for building flyteidl-rust Python binding package Prepare PyO3-ready structures for building flyteidl-rust binding Jun 27, 2024
Copy link

codecov bot commented Jun 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 36.51%. Comparing base (bed761c) to head (4b5e2b1).

Additional details and impacted files
@@            Coverage Diff             @@
##            flyrs    #5525      +/-   ##
==========================================
+ Coverage   36.17%   36.51%   +0.34%     
==========================================
  Files        1302     1302              
  Lines      109492   129155   +19663     
==========================================
+ Hits        39606    47167    +7561     
- Misses      65748    77850   +12102     
  Partials     4138     4138              
Flag Coverage Δ
unittests-datacatalog 51.58% <ø> (+0.21%) ⬆️
unittests-flyteadmin 54.21% <ø> (-1.12%) ⬇️
unittests-flytecopilot 11.73% <ø> (-0.45%) ⬇️
unittests-flytectl 62.47% <ø> (+0.18%) ⬆️
unittests-flyteidl 6.81% <ø> (-0.27%) ⬇️
unittests-flyteplugins 53.48% <ø> (+0.16%) ⬆️
unittests-flytepropeller 42.57% <ø> (+0.82%) ⬆️
unittests-flytestdlib 54.75% <ø> (-0.60%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@austin362667 austin362667 changed the title Prepare PyO3-ready structures for building flyteidl-rust binding [wip] Prepare PyO3-ready structures for building flyteidl-rust binding Jul 5, 2024
}

#[pyclass(subclass, name = "RawSynchronousFlyteClient")]
pub struct RawSynchronousFlyteClient {
Copy link
Contributor Author

@austin362667 austin362667 Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entry point that should wrap all the gRPC service stubs.

}

// #[macro_export]
// macro_rules! concrete_generic_structure {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At present, we don't need to specify the concrete type for generics parameterization, since we output the wrapped FlyteRemote as binding class instead of each gRPC stubs. So the macro oncrete_generic_structure is commented out.

pub fn SerializeToString(&self) -> Result<Vec<u8>, crate::_flyteidl_rust::MessageEncodeError>
{
// Bring `prost::Message` trait to scope here. Put it in outer impl block will leads to duplicated imports.
use prost::Message;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check here: tokio-rs/prost#525


#[pyclass]
#[pyo3(name = "FlyteUserException", subclass, extends = pyo3::exceptions::PyException)] // or PyBaseException
pub struct PyFlyteUserException {}
Copy link
Contributor Author

@austin362667 austin362667 Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XREF on how to raise custom error from rust and catch them in python: PyO3/pyo3#4165

@austin362667 austin362667 changed the title [wip] Prepare PyO3-ready structures for building flyteidl-rust binding [WIP] Make Rust flyteidl PyO3-ready Jul 30, 2024
Signed-off-by: Austin Liu <austin362667@gmail.com>

fmt & clippy

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
…e416bb750efd227e4f8b4302392c0d303

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

fixup workflow API & structure

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

fix `--repository` `testpypi`

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix `--repository` `testpypi`

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

normalize path

Signed-off-by: Austin Liu <austin362667@gmail.com>

stringify path

Signed-off-by: Austin Liu <austin362667@gmail.com>

multi-string path

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

fix glob

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix glob

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix glob

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix glob

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix glob

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix syntax

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix syntax

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix glob

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

specify minimum pyhton version to pypi

Signed-off-by: Austin Liu <austin362667@gmail.com>

specify minimum pyhton version to pypi

Signed-off-by: Austin Liu <austin362667@gmail.com>

specify minimum pyhton version to pypi

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

fix testpypi url

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix testpypi url

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix testpypi url

Signed-off-by: Austin Liu <austin362667@gmail.com>

fix glob

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

add MATURIN_REPOSITORY_URL

Signed-off-by: Austin Liu <austin362667@gmail.com>

add MATURIN_REPOSITORY_URL

Signed-off-by: Austin Liu <austin362667@gmail.com>

add MATURIN_REPOSITORY_URL

Signed-off-by: Austin Liu <austin362667@gmail.com>

add MATURIN_REPOSITORY_URL

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

fmt

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>

[project]
name = "flyteidl"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to keep flyteidl before migrating everything to flyteidl_rust. Could we create another file called pyproject-rust.toml

Creates a RawSynchronousFlyteClient based on destination endpoint that connects to Flyte gRPC Server.

:param endpoint: destination endpoint of a RawSynchronousFlyteClient to be connect to.
:return: a RawSynchronousFlyteClient instance with default platform configuration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add methods here? (e.g. createTask, createWorkflow , etc). If so, I think we can do it in the follow-up PR.

@@ -0,0 +1,31 @@
// Copyright (c) 2015, Google Inc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a README.md to the google/api directory to explain why we need these google protobuf files

Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Signed-off-by: Austin Liu <austin362667@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants