
[Bifrost] Support append_batch #1536

Merged · merged 1 commit into main on May 22, 2024

Conversation

@AhmedSoliman (Contributor) commented on May 21, 2024

[Bifrost] Support append_batch

This introduces a new append_batch API to bifrost that accepts a batch of payloads. In addition, we make use of this API to handle batches of action effects when consuming the effect stream. We attempt to batch up to 10 items (hardcoded at the moment) before sending them to bifrost.

Impact: For small invocations, we now average ~4 records (+1 for metadata) per batch since the invoker emits input/get_state/output etc. in bursts. This reduces the overhead of handling the happy path of small handlers.

This also addresses a small issue where the initial offsets leave an unnecessary gap on re-instantiation.
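(A minimal, hypothetical sketch of the effect-stream batching described above; `consume_effects`, `append_batch`, and `MAX_BATCH` are illustrative stand-ins, not the actual restate identifiers.)

```rust
use tokio::sync::mpsc;

const MAX_BATCH: usize = 10; // hardcoded cap, as in the description above

// Stand-in for bifrost's batch append; in the real code every record in the
// batch still gets its own LSN.
async fn append_batch(records: &[Vec<u8>]) {
    println!("appending {} records as one batch", records.len());
}

async fn consume_effects(mut rx: mpsc::Receiver<Vec<u8>>) {
    while let Some(first) = rx.recv().await {
        let mut batch = vec![first];
        // Opportunistically drain whatever is already queued, up to the cap,
        // so a burst of n effects becomes one append instead of n.
        while batch.len() < MAX_BATCH {
            match rx.try_recv() {
                Ok(next) => batch.push(next),
                Err(_) => break,
            }
        }
        append_batch(&batch).await;
    }
}
```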


Stack created with Sapling. Best reviewed with ReviewStack.

@AhmedSoliman changed the title from [Bifrost] Support append batches to [Bifrost] Support append_batch on May 21, 2024

github-actions bot commented May 21, 2024

Test Results

 99 files  ±0   99 suites  ±0   8m 11s ⏱️ -9s
 83 tests ±0   82 ✅ ±0  1 💤 ±0  0 ❌ ±0 
212 runs  ±0  209 ✅ ±0  3 💤 ±0  0 ❌ ±0 

Results for commit e5b89d4. ± Comparison against base commit e8b09c3.

♻️ This comment has been updated with latest results.

@AhmedSoliman force-pushed the pr1536 branch 2 times, most recently from 0537178 to 2a7bf6c on May 21, 2024 11:57
@AhmedSoliman marked this pull request as ready for review on May 21, 2024 11:57
@AhmedSoliman requested a review from tillrohrmann on May 21, 2024 12:32
@tillrohrmann (Contributor) left a comment


Thanks for creating this improvement @AhmedSoliman. The changes look good to me. The one question I am asking myself is why we need to push the concern of batching to the higher levels. The LogStoreWriter also does some batching, but I assume it is not sufficient? Do you know why this is the case? Could it be possible to use something like ready_chunks in the LogStoreWriter as well (e.g. we extend the WriteBatch as long as we still have ready LogStoreWriteCommands or reach a maximum number)? I guess part of the answer is that we need to do batching at various levels to be efficient, but it somehow feels as if the LogStoreWriter's batching mechanism should be able to handle a bursty invoker (at least not result in n writes for a burst of n messages).
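(For illustration, a minimal sketch of the ready_chunks idea mentioned here, using `futures::StreamExt::ready_chunks`; this is not the actual LogStoreWriter code.)

```rust
use futures::stream::{self, StreamExt};

#[tokio::main]
async fn main() {
    // Stand-in for a stream of LogStoreWriteCommands.
    let commands = stream::iter(0..25);
    // Group however many commands are already ready, up to 10, into one chunk.
    let mut chunks = commands.ready_chunks(10);
    while let Some(chunk) = chunks.next().await {
        // One WriteBatch per chunk instead of one write per command.
        println!("writing a batch of {} commands", chunk.len());
    }
}
```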

crates/worker/src/partition/mod.rs (outdated review thread, resolved)
@@ -125,7 +131,7 @@ impl LogStoreWriter {
                 .expect("metadata cf exists");

             for command in commands {
-                if let Some(data_command) = command.data_update {
+                for data_command in command.data_updates {
tillrohrmann (Contributor):

Why can't the LogStoreWriter handle the batching of bursty records coming from the invoker?

AhmedSoliman (Contributor, Author):

To respect the append_batch contract, we want to either commit all records or none. The interface doesn't expose partial failures.
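(A rough illustration of the all-or-nothing behaviour, assuming the loglet applies a command through the rocksdb crate; names and structure are illustrative, not the actual LogStoreWriter code.)

```rust
use rocksdb::{WriteBatch, WriteOptions, DB};

// All records for one command go into a single RocksDB WriteBatch, which
// RocksDB commits atomically: either every record lands, or none do.
fn apply_data_updates(db: &DB, records: &[(Vec<u8>, Vec<u8>)]) -> Result<(), rocksdb::Error> {
    let mut batch = WriteBatch::default();
    for (key, value) in records {
        batch.put(key, value);
    }
    db.write_opt(batch, &WriteOptions::default())
}
```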

@AhmedSoliman (Contributor, Author) commented:

@tillrohrmann Good question. One thing I want to assert is that we don't want to coalesce a batch into a single record; even if a batch was written from higher levels, it should still translate to one LSN per record. That's the invariant I want to uphold.
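(A tiny sketch of that invariant with placeholder types; `Payload`, `Lsn`, and this `Bifrost` struct are stand-ins, not the actual restate types. Appending a batch of n payloads consumes n consecutive LSNs, one per record.)

```rust
type Payload = Vec<u8>;
type Lsn = u64;

struct Bifrost {
    next_lsn: Lsn,
}

impl Bifrost {
    /// Appends all payloads as one batch and returns the LSN of the last
    /// record; each payload still receives its own LSN.
    fn append_batch(&mut self, payloads: &[Payload]) -> Lsn {
        assert!(!payloads.is_empty(), "batch must not be empty");
        let first = self.next_lsn;
        self.next_lsn += payloads.len() as u64;
        // The batch occupies LSNs first..first + n; it is never collapsed
        // into a single record with a single LSN.
        first + payloads.len() as u64 - 1
    }
}
```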

Now, why does exposing a batch API make sense? Because I want to optimize latency for low-throughput usage patterns (when buffering on the log writer is disabled). If the PP already has a bunch of records ready, bifrost has no way to know whether it should wait for more; it will always flush the write batch as soon as possible (maybe this would be easier to hide if tokio's mpsc had a send_many).

Why do I try to read the ready actuator actions in batches? Even if we don't have a batching API in bifrost, it's possible that the channel has a number of records that should be sent to bifrost, but select!'s scheduling frequently switches from this branch to another, introducing a large idle_duration between bifrost writes. So even if bifrost buffers internally, the writer would time out and write smaller batches, and the overall latency from an ingress request's point of view is much higher.

It's a little tricky to explain all the intricate details in a brief description, but happy to walk you through offline if you like.

@AhmedSoliman merged commit 2b05f4e into main on May 22, 2024
11 checks passed
@AhmedSoliman deleted the pr1536 branch on May 22, 2024 15:51