
Best options for resolving the address of an activated/activating Grain? #9353

Open

d-jagoda opened this issue Feb 16, 2025 · 5 comments

@d-jagoda
Contributor
Hi,

I have a messaging bridge that ingests message requests for grains, and the grains can do heavy work in response. Too many concurrent grain calls of this kind can ultimately starve the system of resources (CPU). I have tested a feedback controller that adjusts concurrency based on input rate, throughput, and CPU utilization during message ingestion, and it works quite well as long as the controller is silo-aware. There is one controller per target silo, and I was relying on pre-emptive grain placement (sending a message to the grain via a grain extension) to track where the current activation might be. Unfortunately this is problematic, because activating a grain may itself induce a resource bottleneck due to code in OnActivateAsync (for example, initializing data from a REST API that doesn't scale well).

Given the following constraints:

  1. Weak consistency is fine as long as the lag is small and large disruptive events that can invalidate grain addresses can be detected (IClusterMembershipService for cluster changes - are there any other such disruptive events?)
  2. I don't want to add an external grain directory or write a custom one

What are the best options for:

  1. Resolving the grain address of an active grain (apart from IManagementGrain.GetActivationAddress)? A sketch of the shape I have in mind follows this list.
  2. Resolving the potential address of an inactive grain (would it be bad to resolve and use the placement directors directly)?
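
For context, here is roughly the shape I'm experimenting with for (1): a weakly consistent cache over IManagementGrain.GetActivationAddress, flushed on membership changes. SiloAddressCache and MembershipWatcher are hypothetical names; I'm assuming GetActivationAddress returns null when there is no current activation and that IClusterMembershipService.MembershipUpdates is the right stream to watch:

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using Orleans;
using Orleans.Runtime;

// Hypothetical weakly consistent cache: grain -> silo, flushed on cluster changes.
public sealed class SiloAddressCache
{
    private readonly IManagementGrain _management;
    private readonly ConcurrentDictionary<GrainId, SiloAddress> _cache = new();

    public SiloAddressCache(IGrainFactory grainFactory)
        => _management = grainFactory.GetGrain<IManagementGrain>(0);

    public async ValueTask<SiloAddress?> TryResolveAsync(IAddressable grain)
    {
        var id = grain.GetGrainId();
        if (_cache.TryGetValue(id, out var silo)) return silo;

        // Assumption: GetActivationAddress returns null when the grain is not activated.
        var resolved = await _management.GetActivationAddress(grain);
        if (resolved is not null) _cache[id] = resolved;
        return resolved;
    }

    // Crude invalidation: drop everything whenever cluster membership changes.
    public void OnClusterChanged() => _cache.Clear();
}

// Background loop driving invalidation from membership updates.
public static class MembershipWatcher
{
    public static async Task WatchAsync(
        IClusterMembershipService membership, SiloAddressCache cache, CancellationToken ct)
    {
        await foreach (var _ in membership.MembershipUpdates.WithCancellation(ct))
        {
            cache.OnClusterChanged();
        }
    }
}
```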

Many thanks,
DJ.

@scalalang2
Contributor

I'm just asking out of curiosity, but why don't you use a placement strategy based on the power-of-two choices?

@d-jagoda
Contributor Author

> I'm just asking out of curiosity, but why don't you use a placement strategy based on the power-of-two choices?

Placement isn't the problem. The problem is concurrent execution of grain calls. Let's say a grain performs a CPU-intensive calculation that takes 200ms: if I continuously send many such requests without managing flow control, the application will be overwhelmed and grain calls will time out. It's the same with slow IO work, where a grain calls a REST API or a database that doesn't scale well; the resource that cannot scale will eventually be overwhelmed and grain calls will time out. If the bottleneck is CPU, the application's responsiveness will drop and it will eventually be killed by a monitoring application - a Kubernetes liveness probe, for example.

@scalalang2
Contributor

I don't have a full understanding of the problem you're solving, so what I'm suggesting may not be fully applicable to you.

Here are a few thoughts:

  • You can monitor incoming calls to grains by using an IncomingGrainCallFilter. If you implement logic there to throttle subsequent requests when traffic overflows, it seems the desired goal can be achieved (see the sketch after this list).

  • Use the AlwaysInterleave attribute.
    Requests sent to a single grain are queued by default, which means the grain handles your requests sequentially.
    If the grain's work is stateless, you can scale operations by attaching the AlwaysInterleave attribute (also shown below).

  • If the grain itself is very hot data, Orleans might not have been the best choice to begin with. Orleans is suited to scenarios where you need to perform relatively small computations across an extremely large number of actors.
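
To make the first two points concrete, here is a rough sketch. The filter name, the fixed limit of 64, and the example grain interface are all placeholders, not a tuned implementation:

```csharp
using System.Threading;
using System.Threading.Tasks;
using Orleans;
using Orleans.Concurrency;

// Sketch: cap how many grain calls execute concurrently on this silo.
// The fixed limit of 64 is a placeholder; an adaptive controller could change it.
public sealed class ThrottlingCallFilter : IIncomingGrainCallFilter
{
    private static readonly SemaphoreSlim Gate = new(initialCount: 64, maxCount: 64);

    public async Task Invoke(IIncomingGrainCallContext context)
    {
        await Gate.WaitAsync();
        try
        {
            await context.Invoke(); // proceed with the actual grain call
        }
        finally
        {
            Gate.Release();
        }
    }
}

// Registered on the silo builder, e.g.:
// siloBuilder.AddIncomingGrainCallFilter<ThrottlingCallFilter>();

// AlwaysInterleave lets calls to a stateless method interleave instead of
// queuing behind one another on the activation.
public interface IWorkerGrain : IGrainWithStringKey
{
    [AlwaysInterleave]
    Task DoStatelessWorkAsync();
}
```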

@ledjon-behluli
Contributor

@d-jagoda
Contributor Author

@ledjon-behluli thanks for the suggestions. It's the right type of idea; however, in my case rate limiting will be based on the cluster's ability to process requests without degrading its performance. The limits would be self-tuning, much like the thread pool, and would respond to the cluster's elasticity (when autoscaling helps). In the simpler scenarios, having access to the grain directory in order to resolve the silo address of a grain would help. In more complex scenarios, where one grain calls another grain, I would have to profile the request path to assess the full impact.
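
To sketch the self-tuning part, this is roughly the per-silo gate the controller would drive. All names here are hypothetical, and the AIMD adjustment (additive increase, multiplicative decrease, as in TCP) is just one possible policy:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using Orleans.Runtime;

// Hypothetical per-silo gate: each target silo has its own concurrency limit,
// adjusted by the feedback controller from throughput/CPU signals.
public sealed class PerSiloConcurrencyGate
{
    private sealed class Gate
    {
        public int Limit = 16; // starting limit; tuned by the controller
        public int InFlight;   // calls currently outstanding to this silo
    }

    private readonly ConcurrentDictionary<SiloAddress, Gate> _gates = new();

    // Admit a call to the given silo; the caller queues or sheds load on false.
    public bool TryEnter(SiloAddress silo)
    {
        var gate = _gates.GetOrAdd(silo, _ => new Gate());
        while (true)
        {
            var inFlight = Volatile.Read(ref gate.InFlight);
            if (inFlight >= Volatile.Read(ref gate.Limit)) return false;
            if (Interlocked.CompareExchange(ref gate.InFlight, inFlight + 1, inFlight) == inFlight)
                return true;
        }
    }

    public void Exit(SiloAddress silo)
    {
        if (_gates.TryGetValue(silo, out var gate))
            Interlocked.Decrement(ref gate.InFlight);
    }

    // AIMD: additive increase while healthy, multiplicative decrease on pressure.
    public void Adjust(SiloAddress silo, bool healthy)
    {
        if (!_gates.TryGetValue(silo, out var gate)) return;
        var limit = Volatile.Read(ref gate.Limit);
        Volatile.Write(ref gate.Limit, healthy ? limit + 1 : Math.Max(1, limit / 2));
    }
}
```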
