[Bifrost] Support append_batch #1536
Conversation
Thanks for creating this improvement @AhmedSoliman. The changes look good to me. The one question I am asking myself is why we need to push the concern of batching to the higher levels. The `LogStoreWriter` also does some batching, but I assume it is not sufficient? Do you know why this is the case? Could we use something like `ready_chunks` in the `LogStoreWriter` as well (e.g. we extend the `WriteBatch` as long as we still have ready `LogStoreWriteCommand`s, or until we reach a maximum number)? I guess part of the answer is that we need to do batching at various levels to be efficient, but it somehow feels as if the `LogStoreWriter`'s batching mechanism should be able to handle a bursty invoker (at least not result in n writes for a burst of n messages). A minimal sketch of that idea follows.
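For illustration, here is a rough sketch of what `ready_chunks`-based batching in the writer loop could look like, assuming the writer consumes commands from a tokio mpsc channel; the type stub, function name, and batch cap are hypothetical, not taken from the codebase:

```rust
use futures::StreamExt; // provides ready_chunks
use tokio::sync::mpsc;
use tokio_stream::wrappers::ReceiverStream;

// Stand-in for the real command type; hypothetical, for illustration only.
struct LogStoreWriteCommand;

const MAX_COMMANDS_PER_BATCH: usize = 64; // illustrative cap, not from the PR

async fn writer_loop(rx: mpsc::Receiver<LogStoreWriteCommand>) {
    // ready_chunks yields a Vec of every command that is already queued
    // (up to the cap) without waiting for more to arrive, so a burst of
    // n commands becomes one write instead of n.
    let mut chunks = ReceiverStream::new(rx).ready_chunks(MAX_COMMANDS_PER_BATCH);
    while let Some(commands) = chunks.next().await {
        for _command in &commands {
            // serialize the command's updates into a single WriteBatch ...
        }
        // commit the whole batch with one write to the log store ...
    }
}
```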
```diff
@@ -125,7 +131,7 @@ impl LogStoreWriter {
             .expect("metadata cf exists");

         for command in commands {
-            if let Some(data_command) = command.data_update {
+            for data_command in command.data_updates {
```
Why can't the `LogStoreWriter` handle the batching of bursty records coming from the invoker?
To respect the `append_batch` contract, we want to either commit all records or none; the interface doesn't expose partial failures.
@tillrohrmann Good question. One thing I want to assert is that we don't want to coalesce multiple records into a single record: even if a batch is written from higher levels, it should still translate to one LSN per record. That's the invariant I want to uphold.

Now, why does exposing a batch API make sense? Because I want to optimize latency for low-throughput usage patterns (when buffering on the log writer is disabled). If the PP already has a bunch of records ready, bifrost has no way to know whether it should wait for more; it will always flush the write batch as soon as possible (this might be easier to hide if tokio's mpsc had a `send_many`). Why do I try to read the ready actuator actions in batches? Because even without a batching API in bifrost, the channel may already hold a number of records that should be sent to bifrost. It's a little tricky to explain all the intricate details in a brief description, but happy to walk you through it offline if you like.
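To make that contract concrete, here is a hypothetical shape for the API; the actual trait, type names, and signatures in the PR may well differ. The point is that the batch commits atomically while each payload still receives its own LSN:

```rust
// Hypothetical sketch of the contract, not the PR's actual signatures.
struct Payload;
struct Lsn(u64);
struct Error;

trait Log {
    /// Appends one payload and returns its LSN.
    async fn append(&mut self, payload: Payload) -> Result<Lsn, Error>;

    /// Appends all payloads atomically: either every record commits or the
    /// call fails as a whole (no partial failures are exposed). Each record
    /// still occupies its own LSN; the returned LSN is the last one in the
    /// contiguous range assigned to the batch.
    async fn append_batch(&mut self, payloads: Vec<Payload>) -> Result<Lsn, Error>;
}
```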
[Bifrost] Support append_batch
This introduces a new `append_batch` API to bifrost that accepts a batch of payloads. In addition, we make use of this API to handle batches of action effects when consuming the effect stream. We attempt to batch up to 10 items (hardcoded at the moment) before sending them to bifrost.

Impact: for small invocations, we now average ~4 records (+1 for metadata) per batch, since the invoker emits input/get_state/output etc. in bursts. This reduces the overhead of handling the happy path of small handlers.

This also addresses a small issue where the initial offsets leave an unnecessary gap on re-instantiation.
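As a rough illustration of the consumer side described above, assuming an mpsc channel of action effects; the effect type and the call at the end are stand-ins, and the cap of 10 mirrors the hardcoded limit mentioned in the description:

```rust
use tokio::sync::mpsc;

// Stand-in for the real action-effect type; hypothetical.
struct ActionEffect;

const MAX_BATCH: usize = 10; // mirrors the hardcoded limit above

async fn consume_effects(mut rx: mpsc::Receiver<ActionEffect>) {
    let mut batch: Vec<ActionEffect> = Vec::with_capacity(MAX_BATCH);
    while let Some(first) = rx.recv().await {
        batch.push(first);
        // Opportunistically drain whatever is already queued, without waiting.
        while batch.len() < MAX_BATCH {
            match rx.try_recv() {
                Ok(effect) => batch.push(effect),
                Err(_) => break,
            }
        }
        // One append_batch call per burst instead of one append per record;
        // each record still receives its own LSN.
        // bifrost.append_batch(&batch).await ... (API shape assumed above)
        batch.clear();
    }
}
```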
Stack created with Sapling. Best reviewed with ReviewStack.