Extending the Ray Serve integration to allow attributes for Serve deployments #2918

zoltan-fedor · 2022-07-28T22:48:15Z

Related Issue(s): closes #2917

We should be able to set Ray Serve attributes for the nodes of a pipeline, such as the amount of GPU to use, max_concurrent_queries, etc.

With this change, these attributes can be set in the pipeline YAML file for each node of the pipeline.

Proposed changes:

Extend the Ray Serve integration so that attributes can be provided for the Serve deployments, such as GPU usage (including fractional GPU usage), max_concurrent_queries, and so on. See the Ray Serve API documentation for the full list of attributes accepted by ray.serve.deployment().
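As a rough illustration of what this enables, the sketch below turns one node's config into keyword arguments for a Serve deployment. The helper `build_deployment_options`, the node layout and the values are hypothetical and are not Haystack's implementation. The key `serve_deployment_kwargs` is the name the review later in this thread settles on, and `num_replicas`, `max_concurrent_queries` and `ray_actor_options` are assumed to match the option names of the Ray Serve version in use.

```python
from typing import Any, Dict


def build_deployment_options(node: Dict[str, Any]) -> Dict[str, Any]:
    """Collect keyword arguments for one node's Serve deployment (hypothetical helper)."""
    # Copy the mapping so the caller's config is not mutated.
    options = dict(node.get("serve_deployment_kwargs") or {})
    # Naming the deployment after the node is an assumption made for illustration.
    options.setdefault("name", node["name"])
    return options


# Node config as it might look after parsing the pipeline YAML (illustrative values).
reader_node = {
    "name": "Reader",
    "inputs": ["Retriever"],
    "serve_deployment_kwargs": {
        "num_replicas": 2,
        "max_concurrent_queries": 17,
        "ray_actor_options": {"num_gpus": 0.5},  # fractional GPU per replica
    },
}

options = build_deployment_options(reader_node)
print(options)
```

A dict like this could then be unpacked into `ray.serve.deployment(**options)`, provided the installed Ray version accepts each of these options.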

Pre-flight checklist

I have read the contributors guidelines
I have enabled actions on my fork
If this is a code change, I added tests or updated existing ones
If this is a code change, I updated the docstrings

ZanSara · 2022-07-29T08:52:39Z

Hey @zoltan-fedor! Thank you for this PR. I will review it shortly 🙂

ZanSara

There are a couple of catches here.

The core issue is that you added the deployment_kwargs block to the general pipeline schema, without restricting it to Ray pipelines. This needs two actions to be fixed:

We need a pipeline test that verifies that Ray attributes cannot be added to base pipelines. It could be something as simple as adding the following test in tests/pipelines/test_pipeline.yaml.py, approx line 996:

def test_load_yaml_ray_args_in_pipeline(tmp_path):
    # Loading a Ray pipeline definition through the base Pipeline class should be rejected.
    with pytest.raises(PipelineConfigError) as e:
        pipeline = Pipeline.load_from_yaml(
            SAMPLES_PATH / "pipeline" / "ray.haystack-pipeline.yml", pipeline_name="ray_query_pipeline",
        )

We need to restrict this argument to Ray pipelines. This is rather easy. You just need to add deployment_kwargs alongside replicas in this part of the schema:

haystack/haystack/nodes/_json_schema.py

Lines 312 to 321 in b78db1c

    
    "oneOf": [
        {
            "not": {"required": ["extras"]},
            "properties": {
                "pipelines": {
                    "title": "Pipelines",
                    "items": {"properties": {"nodes": {"items": {"not": {"required": ["replicas"]}}}}},
                }
            },
        },
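To make the intent of this schema branch concrete, here is a hand-written sketch of the two conditions it expresses: a base (non-Ray) config must not set the top-level `extras` key, and none of its nodes may set `replicas`. The function `base_pipeline_violations` and the sample config are hypothetical. This is not how Haystack validates configs, which goes through the JSON schema itself.

```python
from typing import Any, Dict, List


def base_pipeline_violations(config: Dict[str, Any]) -> List[str]:
    """List reasons why a config does not fit the non-Ray schema branch (sketch)."""
    problems = []
    # "not": {"required": ["extras"]} means the top-level "extras" key must be absent.
    if "extras" in config:
        problems.append("top-level 'extras' is set")
    # Every node of every pipeline must not carry the Ray-only "replicas" key.
    for pipeline in config.get("pipelines", []):
        for node in pipeline.get("nodes", []):
            if "replicas" in node:
                problems.append(f"node {node.get('name')!r} sets 'replicas'")
    return problems


# Illustrative base config that wrongly uses a Ray-only key.
base_config = {
    "pipelines": [
        {"name": "query", "nodes": [{"name": "Reader", "inputs": ["Query"], "replicas": 2}]}
    ]
}
print(base_pipeline_violations(base_config))
```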

I hope it's all clear! Let me know if there's any question 🙂

One last thing: how about renaming the new param to serve_deployment_kwargs? Just to clarify it's related to Ray Serve. Otherwise deployment_kwargs sounds very generic.

zoltan-fedor · 2022-07-29T11:46:42Z

Hi @ZanSara,
As requested, I have renamed deployment_kwargs to serve_deployment_kwargs.

I have also added serve_deployment_kwargs to this part of the schema:

"oneOf": [ 
     { 
         "not": {"required": ["extras"]}, 
         "properties": { 
             "pipelines": { 
                 "title": "Pipelines", 
                 "items": {"properties": {"nodes": {"items": {"not": {"required": ["replicas", "serve_deployment_kwargs"]}}}}}, 
             } 
         }, 
     },

But that does not cause the exception expected by the test to be raised, so this test will fail:

def test_load_yaml_ray_args_in_pipeline(tmp_path):
    with pytest.raises(PipelineConfigError) as e:
        pipeline = Pipeline.load_from_yaml(
            SAMPLES_PATH / "pipeline" / "ray.haystack-pipeline.yml", pipeline_name="ray_query_pipeline",
        )

Any recommendations?
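One JSON Schema detail is worth checking in the schema fragment above: `required` lists keys that must all be present, so `{"not": {"required": ["replicas", "serve_deployment_kwargs"]}}` rejects a node only when both keys appear together. The sketch below mimics that rule in plain Python and compares it with an "any of these keys" variant. Both functions are hand-written illustrations, not the jsonschema library or Haystack code, and this may or may not be related to the failing test.

```python
from typing import Any, Dict, List


def rejected_by_not_required(node: Dict[str, Any], keys: List[str]) -> bool:
    """Mimic {"not": {"required": keys}}: reject only if every key is present."""
    return all(key in node for key in keys)


def rejected_if_any_present(node: Dict[str, Any], keys: List[str]) -> bool:
    """Mimic {"not": {"anyOf": [{"required": [k]} for k in keys]}}: reject if any key is present."""
    return any(key in node for key in keys)


ray_keys = ["replicas", "serve_deployment_kwargs"]
# A node that uses only one of the two Ray-only keys (illustrative values).
node = {"name": "Reader", "serve_deployment_kwargs": {"num_replicas": 2}}

print(rejected_by_not_required(node, ray_keys))  # False: "replicas" is missing
print(rejected_if_any_present(node, ray_keys))   # True: one Ray-only key is enough
```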

ZanSara · 2022-07-29T12:13:57Z

You're right about the test! We should add a check in the pipeline loading for the extras: ray. I'm going to add a comment about where I'd like to see this check added, hold on
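A minimal sketch of what such a check could look like follows. The function `check_ray_only_keys`, the constant `RAY_ONLY_NODE_KEYS` and the local `PipelineConfigError` stand-in are hypothetical. The validation that was actually committed is not shown in this thread and may differ.

```python
from typing import Any, Dict

# Node-level keys assumed to be meaningful only for Ray pipelines.
RAY_ONLY_NODE_KEYS = ("replicas", "serve_deployment_kwargs")


class PipelineConfigError(Exception):
    """Local stand-in for Haystack's exception so the sketch is self-contained."""


def check_ray_only_keys(config: Dict[str, Any]) -> None:
    """Raise if a config without `extras: ray` uses Ray-only node keys."""
    if config.get("extras") == "ray":
        return  # Ray pipelines are allowed to use these keys.
    for pipeline in config.get("pipelines", []):
        for node in pipeline.get("nodes", []):
            used = [key for key in RAY_ONLY_NODE_KEYS if key in node]
            if used:
                raise PipelineConfigError(
                    f"Node {node.get('name')!r} uses {used}, which requires 'extras: ray'."
                )
```

With a check like this in the loading path, a base pipeline that carries Ray-only keys would fail at load time, which is the behaviour the requested test expects.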

ZanSara · 2022-07-29T12:21:00Z

Ok that's not very easy to explain 😅 I'll add a commit myself to do that 👍

zoltan-fedor · 2022-07-29T12:24:15Z

Let me finish the removal of the replicas

zoltan-fedor · 2022-07-29T12:32:34Z

I have made the replicas change by moving it below the new serve_deployment_kwargs, but I don't think I am allowed to add the breaking label to this PR. Would you mind adding that?
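Since this change is breaking for existing Ray pipeline configs, the sketch below shows one way a node definition could be migrated by hand. It assumes the old `replicas` value moves to a `num_replicas` entry under `serve_deployment_kwargs`, following Ray Serve's option name. The helper `migrate_node` is hypothetical and not part of Haystack.

```python
import copy
from typing import Any, Dict


def migrate_node(node: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the node with `replicas` moved under `serve_deployment_kwargs`."""
    migrated = copy.deepcopy(node)
    if "replicas" in migrated:
        replicas = migrated.pop("replicas")
        kwargs = migrated.setdefault("serve_deployment_kwargs", {})
        # An explicit num_replicas already present in the new block takes precedence.
        kwargs.setdefault("num_replicas", replicas)
    return migrated


old_node = {"name": "Reader", "inputs": ["Retriever"], "replicas": 2}
print(migrate_node(old_node))
```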

ZanSara · 2022-07-29T12:34:06Z

> Let me finish the removal of the replicas

Sorry! I hadn't seen this. I'm done anyway, I'll be back in a while to see how the tests look.

ZanSara

Great work, thanks for it! 😊 I have no remarks.

zoltan-fedor added 7 commits July 28, 2022 18:32

Ran black and regenerated the json schemas

e886b40

Fixing the JSON Schema generation

be1f329

Trying to fix the schema CI test issue

d8d3a61

Fixing the test and the schemas

8fab1a1

Python 3.8 was generating a different schema than Python 3.7 is creating in the CI. You MUST use Python 3.7 to generate the schemas, otherwise the CIs will fail.

Merge the two Ray pipeline test cases

6f9b221

Generate the JSON schemas again after $ pip install .[all]

b3c9b73

sjrl added the journey:advanced and type:feature labels Jul 29, 2022

ZanSara requested review from dmigo and ZanSara July 29, 2022 08:51

ZanSara added the topic:pipeline label Jul 29, 2022

ZanSara suggested changes Jul 29, 2022

haystack/json-schemas/haystack-pipeline-1.16.schema.json

haystack/json-schemas/haystack-pipeline.schema.json

dmigo reviewed Jul 29, 2022

haystack/pipelines/ray.py

zoltan-fedor added 2 commits July 29, 2022 07:16

Removing haystack/json-schemas/haystack-pipeline-1.16.schema.json

e8ded68

This was generated by the JSON generator, but based on @ZanSara's instructions, I am removing it.

Making changes based on @ZanSara's request - the newly requested test…

5a4af7d

… is failing

Fixing the JSON schema generation again

0bbec84

Renaming replicas and moving it under serve_deployment_kwargs

37d9b4d

add extras validation, untested

54de278

ZanSara added the breaking change label Jul 29, 2022

Documentation update

96cc688

zoltan-fedor added 2 commits July 29, 2022 08:36

Merge branch 'feature-ray-serve-deployment-args' of github.com:zoltan…

459f47a

…-fedor/haystack into feature-ray-serve-deployment-args

Black

d50ed96

zoltan-fedor requested a review from ZanSara July 29, 2022 16:23

[EMPTY] Re-trigger CI

1347962

ZanSara approved these changes Aug 3, 2022

ZanSara merged commit 7b97bbb into deepset-ai:master Aug 3, 2022

zoltan-fedor deleted the feature-ray-serve-deployment-args branch August 3, 2022 16:52
