Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support multiple RayPipelines #4078

Merged

Conversation

zoltan-fedor
Copy link
Contributor

Related Issues

Proposed Changes:

Currently you can only run one RayPipeline within one app, because the RayPipeline always tries to initialize a new Ray connection - if one is already initialized then it will simply fail.
See the Ray initialization at /~https://github.com/deepset-ai/haystack/blob/main/haystack/pipelines/ray.py#L80

Due to that, the app will fail when it starts to initialize the second RayPipeline, as the first one has already initialized the Ray connection and an error will be thrown:

  File "/app/raypipelines/qa_pipeline.py", line 87, in __init__
    self.raypipeline = RayPipeline.load_from_yaml(
  File "/hW1oTPAF/src/farm-haystack/haystack/pipelines/ray.py", line 193, in load_from_yaml
    return cls.load_from_config(
  File "/hW1oTPAF/src/farm-haystack/haystack/pipelines/ray.py", line 101, in load_from_config
    pipeline = cls(address=address, ray_args=ray_args or {}, serve_args=serve_args or {})
  File "/hW1oTPAF/src/farm-haystack/haystack/pipelines/ray.py", line 80, in __init__
    ray.init(address=address, **ray_args)
  File "/hW1oTPAF/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/hW1oTPAF/lib/python3.10/site-packages/ray/_private/worker.py", line 1255, in init
    ctx = builder.connect()
  File "/hW1oTPAF/lib/python3.10/site-packages/ray/client_builder.py", line 182, in connect
    client_info_dict = ray.util.client_connect.connect(
  File "/hW1oTPAF/lib/python3.10/site-packages/ray/util/client_connect.py", line 44, in connect
    raise RuntimeError(
RuntimeError: Ray Client is already connected. Maybe you called ray.init("ray://<address>") twice by accident?

Having two RayPipelines are now being made possible by this PR by checking if Ray has already been initialized and if yes, then we skip initialization (which would fail) (alternatively we could catch the error, but I believe it is nicer to do it this way).

Obviously this also means that the second RayPipeline's parameters will NOT be applied to the Ray connection, as the first RayPipeline has already opened that connection, so throwing a warning log is warranted when Ray is initialized, so the user is aware not the Ray connection is not initialized again.

How did you test it?

I have an app with two RayPipelines which now do run with this change and they were not before.

Notes for the reviewer

I guess this is pretty simple, the only twist is that the second RayPipelines's ray_args are ignored, as a Ray connection is already initialized and Ray doesn't allow you to have two of those - hence the warning I have included.

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added tests that demonstrate the correct behavior of the change
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@zoltan-fedor zoltan-fedor requested a review from a team as a code owner February 7, 2023 00:18
@zoltan-fedor zoltan-fedor requested review from julian-risch and removed request for a team February 7, 2023 00:18
@sjrl sjrl added the type:feature New feature or request label Feb 7, 2023
@julian-risch julian-risch changed the title Supporting multiple RayPipelines feat: Support multiple RayPipelines Feb 7, 2023
Copy link
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! 👍 I just added feat: to the PR's title according to our conventions.
It's good that the change contains the warning message. One could have extended the message with a comparison of the parameters that ray was initialized with before and the parameters that were provided but not used when the logging message is shown. If they are the same then it's no problem. If they differ one could even raise an Exception.

@julian-risch julian-risch merged commit a3016f0 into deepset-ai:main Feb 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic:pipeline type:feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: Support the use of multiple RayPipelines from the same app
3 participants