Solutions for the Distributed System Challenges from Fly.io and Kyle Kingsbury
Hello world! Just echoes back what you send it.
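A minimal sketch of that handler, assuming the official Maelstrom Go package (the actual solution may differ in details):

```go
package main

import (
	"encoding/json"
	"log"

	maelstrom "github.com/jepsen-io/maelstrom/demo/go"
)

func main() {
	n := maelstrom.NewNode()

	// Reply to every "echo" message with an "echo_ok" carrying the same body.
	n.Handle("echo", func(msg maelstrom.Message) error {
		var body map[string]any
		if err := json.Unmarshal(msg.Body, &body); err != nil {
			return err
		}
		body["type"] = "echo_ok"
		return n.Reply(msg, body)
	})

	if err := n.Run(); err != nil {
		log.Fatal(err)
	}
}
```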
Generates globally-unique IDs on each node. Inspired by MongoDB's ObjectId.
Each ID is a concatenation of:
- Node ID: to avoid collisions with other nodes
- Current time: to avoid collisions with messages sent at the same time
- Count: to avoid collisions with messages sent at the same time and node
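A rough sketch of that concatenation (names and the separator are illustrative, not the repo's exact format):

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// idGen builds IDs out of the three parts listed above: node ID, current time,
// and a per-node counter.
type idGen struct {
	nodeID string
	count  atomic.Uint64
}

func (g *idGen) next() string {
	return fmt.Sprintf("%s-%d-%d", g.nodeID, time.Now().UnixNano(), g.count.Add(1))
}

func main() {
	g := idGen{nodeID: "n1"}
	fmt.Println(g.next()) // e.g. n1-1700000000000000000-1
}
```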
A basic node that receives messages, saves them to a slice, and returns them on the read message.
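A minimal sketch of such a node, assuming the official Maelstrom Go package:

```go
package main

import (
	"encoding/json"
	"log"
	"sync"

	maelstrom "github.com/jepsen-io/maelstrom/demo/go"
)

func main() {
	n := maelstrom.NewNode()

	var (
		mu       sync.Mutex
		messages []int
	)

	// Save every broadcast message into a slice.
	n.Handle("broadcast", func(msg maelstrom.Message) error {
		var body struct {
			Message int `json:"message"`
		}
		if err := json.Unmarshal(msg.Body, &body); err != nil {
			return err
		}
		mu.Lock()
		messages = append(messages, body.Message)
		mu.Unlock()
		return n.Reply(msg, map[string]any{"type": "broadcast_ok"})
	})

	// Return everything seen so far on a read.
	n.Handle("read", func(msg maelstrom.Message) error {
		mu.Lock()
		defer mu.Unlock()
		return n.Reply(msg, map[string]any{"type": "read_ok", "messages": messages})
	})

	// The topology message can be ignored for the single-node case.
	n.Handle("topology", func(msg maelstrom.Message) error {
		return n.Reply(msg, map[string]any{"type": "topology_ok"})
	})

	if err := n.Run(); err != nil {
		log.Fatal(err)
	}
}
```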
Same as the previous challenge, but now it broadcasts the message to all neighbors.
The node spawns a new thread for each message sent to a neighbor and backs off exponentially if the neighbor doesn't respond.
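A rough sketch of that retry loop using only the standard library; the send function, delays, and attempt limit are illustrative:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// sendWithBackoff calls send until it succeeds, doubling the wait after every
// failed attempt (the exponential backoff mentioned above).
func sendWithBackoff(send func() error, maxAttempts int) error {
	delay := 100 * time.Millisecond
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err := send(); err == nil {
			return nil
		}
		time.Sleep(delay)
		delay *= 2
	}
	return errors.New("neighbor never acknowledged the message")
}

func main() {
	attempts := 0
	_ = sendWithBackoff(func() error {
		attempts++
		if attempts < 3 {
			return errors.New("timeout")
		}
		return nil
	}, 5)
	fmt.Println("acknowledged after", attempts, "attempts")
}
```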
This solution is pretty dumb and not the most efficient one; here are other ideas for how it could be done:
- Use a map to keep track of which messages are not yet acknowledged and retry them with the next messages
- Use a separate thread with a ticker to periodically distribute the messages to neighbors
The node receiving the message broadcasts it to all the other nodes in the network with a send-and-forget approach, using the gossip message type.
Results from a run:
All checks passed
Messages per op: 11.789474/30
Median latency: 84/400
Maximum latency: 105/600
The previous solution also works for this challenge, but I've taken it a step further and tried to make it send as few messages per operation as possible.
It's the same idea as #3d, but it spawns a new thread that broadcasts the pending messages every 1.5s through a buffered channel, and only broadcasts if there are new messages.
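A rough, standard-library-only sketch of that loop; the channel size and flush function are illustrative:

```go
package main

import (
	"fmt"
	"time"
)

// batcher collects pending messages and flushes them to the neighbors every
// 1.5s, but only when there is something new to send.
func batcher(pending <-chan int, flush func(batch []int)) {
	ticker := time.NewTicker(1500 * time.Millisecond)
	defer ticker.Stop()

	var batch []int
	for {
		select {
		case m := <-pending:
			batch = append(batch, m)
		case <-ticker.C:
			if len(batch) == 0 {
				continue // nothing new, skip this round
			}
			flush(batch)
			batch = nil
		}
	}
}

func main() {
	pending := make(chan int, 1024) // buffered so handlers never block
	go batcher(pending, func(batch []int) { fmt.Println("gossiping", batch) })

	for i := 0; i < 5; i++ {
		pending <- i
	}
	time.Sleep(2 * time.Second)
}
```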
Results:
All checks passed
Messages per op: 4.728111/20
Median latency: 791/1000
Maximum latency: 1584/2000
I've made two solutions for this challenge: one that uses an OT approach and another with a CRDT approach.
The OT (Operational Transformation) solution uses the sequential KV store from Maelstrom and tries to Compare-and-Swap the value of the counter.
The CRDT (Conflict-free Replicated Data Type) solution uses a map to store the counters of the other nodes and propagates its own when it receives an add message.
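A sketch of the CRDT side, a grow-only counter kept as a per-node map (the message handling around it is omitted):

```go
package main

import "fmt"

// gCounter keeps one counter per node; the read value is the sum of all slots.
type gCounter map[string]int

// add increments this node's own slot.
func (c gCounter) add(self string, delta int) { c[self] += delta }

// merge takes the element-wise maximum, so replicas converge regardless of
// the order in which gossip arrives.
func (c gCounter) merge(other gCounter) {
	for node, count := range other {
		if count > c[node] {
			c[node] = count
		}
	}
}

// value is the total returned to clients on a read.
func (c gCounter) value() int {
	total := 0
	for _, count := range c {
		total += count
	}
	return total
}

func main() {
	a, b := gCounter{}, gCounter{}
	a.add("n0", 3)
	b.add("n1", 2)
	a.merge(b)
	fmt.Println(a.value()) // 5
}
```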
Links:
- CRDTs: The Hard Parts by Martin Kleppmann - Great explanation about the differences between OTs and CRDTs
- Conflict-free Replicated Data Types on Wikipedia
A simple implementation of a Kafka-style log: store messages in a map and return the messages after a given offset.
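A rough sketch of that map-based log; types and field names are illustrative:

```go
package main

import "fmt"

// logStore keeps one append-only slice per key; offsets are slice indexes.
type logStore struct {
	topics map[string][]int
}

// send appends a message and returns its offset.
func (s *logStore) send(key string, msg int) int {
	s.topics[key] = append(s.topics[key], msg)
	return len(s.topics[key]) - 1
}

// poll returns [offset, message] pairs starting at the given offset.
func (s *logStore) poll(key string, offset int) [][2]int {
	var out [][2]int
	for i := offset; i < len(s.topics[key]); i++ {
		out = append(out, [2]int{i, s.topics[key][i]})
	}
	return out
}

func main() {
	s := &logStore{topics: map[string][]int{}}
	s.send("k1", 10)
	s.send("k1", 20)
	fmt.Println(s.poll("k1", 1)) // [[1 20]]
}
```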
Implementation using the Lin-KV store:
- Compare and Swap to get the next offset for a key
- Store each message in a separate key
- Last Write Wins to store the committed offsets
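A sketch of the Compare-and-Swap step from the list above, assuming the official Maelstrom Go package and a hypothetical offset_<key> layout:

```go
package logkv

import (
	"context"

	maelstrom "github.com/jepsen-io/maelstrom/demo/go"
)

// nextOffset reads the current offset for a key and tries to Compare-and-Swap
// it to the next value, retrying when another node wins the race.
func nextOffset(ctx context.Context, kv *maelstrom.KV, key string) (int, error) {
	offsetKey := "offset_" + key // illustrative key layout
	for {
		current, err := kv.ReadInt(ctx, offsetKey)
		if err != nil {
			current = 0 // assume the key simply doesn't exist yet
		}
		if err := kv.CompareAndSwap(ctx, offsetKey, current, current+1, true); err == nil {
			return current, nil
		}
		// Another node allocated this offset first; retry with the fresh value.
	}
}
```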
The previous solution also works for this challenge, so I tried to remove the KV completely and use a leader-follower approach.
The leader node is fixed to n0 (this could be done with a consensus algorithm instead).
The leader handles the offsets to avoid conflicts. All the nodes exchange messages to get updated messages and committed offsets.
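A small, hypothetical sketch of the leader-side bookkeeping; the message exchange between nodes is not shown:

```go
package leader

import "sync"

// leaderState: the fixed leader (n0) owns the next offset for every key, so
// followers forward their send requests to it instead of racing on a shared KV.
type leaderState struct {
	mu   sync.Mutex
	next map[string]int // key -> next offset to hand out
}

func (s *leaderState) assign(key string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	off := s.next[key]
	s.next[key] = off + 1
	return off
}
```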
Note: the txn-rw-register
workload doesn't really check for anomalies; it passes even when returning 0 for all keys. I've tried to follow the definitions from Jepsen instead. Thanks to teivah's and mchernyakov's solutions, I could check whether I was on the right track.
A simple, totally-available transactional store that keeps a single map and a lock for each read/write operation.
This one took me lots of back-and-forth because I wasn't sure how to implement the mutexes to get the right consistency model. Even though the test specifies read uncommitted, that level still doesn't allow G0/P0 anomalies.
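A rough sketch of the single map plus lock, with a transaction as a list of [op, key, value] micro-operations; the JSON decoding the real handler needs is omitted:

```go
package main

import (
	"fmt"
	"sync"
)

// store applies a whole transaction atomically under one mutex.
type store struct {
	mu   sync.Mutex
	data map[int]int
}

func (s *store) apply(txn [][3]any) [][3]any {
	s.mu.Lock()
	defer s.mu.Unlock()
	for i, op := range txn {
		key := op[1].(int)
		switch op[0].(string) {
		case "r":
			txn[i][2] = s.data[key] // fill in the read result
		case "w":
			s.data[key] = op[2].(int)
		}
	}
	return txn
}

func main() {
	s := &store{data: map[int]int{}}
	fmt.Println(s.apply([][3]any{{"w", 1, 10}, {"r", 1, nil}}))
}
```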
This solution uses a g-counter to keep track of the transactions, ensuring that G0/P0 anomalies won't happen and that the latest possible state of each key is kept, and it replicates each write operation to all the other nodes.
In case of network partitions, the key keeps being replicated until it's acknowledged by the other node.
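A hypothetical sketch of how replicated writes could be versioned with such a counter so that stale replications are ignored; this is my reading of the approach, not the repo's exact code:

```go
package replication

// versioned pairs a value with the counter value that produced it, so a
// replica can always keep the latest possible state of a key.
type versioned struct {
	value   int
	version uint64
}

type replica struct {
	data map[int]versioned
}

// applyRemote applies a replicated write only if it is newer than what we
// already have, so duplicated or reordered replications can't step backwards.
func (r *replica) applyRemote(key, value int, version uint64) {
	if current, ok := r.data[key]; ok && current.version >= version {
		return // stale replication, ignore it
	}
	r.data[key] = versioned{value: value, version: version}
}
```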
Links:
- Read Uncommitted from Jepsen docs
- A Critique of ANSI SQL Isolation Levels By Berenson, Microsoft
The solution follows the same idea as the previous one, with g-counters, but it replicates entire transactions instead of individual writes, and it first writes to a separate store and then merges the changes into the main store.
Like the previous solution, the txn is replicated until it's acknowledged by the other node. On replication, it also merges the changes using a separate store.
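A hypothetical sketch of that two-step apply: writes land in a staging map first and the whole transaction is merged into the main store at the end, mirroring the separate-store-then-merge step described above:

```go
package txns

import "sync"

// txnStore applies a transaction against a private staging map first and only
// merges it into the main store once every operation has been processed.
type txnStore struct {
	mu   sync.Mutex
	main map[int]int
}

// op is a [f, key, value] micro-operation, with f = "r" or "w".
type op struct {
	f     string
	key   int
	value int
}

func (s *txnStore) apply(txn []op) []op {
	staging := map[int]int{}

	s.mu.Lock()
	defer s.mu.Unlock()
	for i, o := range txn {
		switch o.f {
		case "r":
			if v, ok := staging[o.key]; ok {
				txn[i].value = v // read our own buffered write
			} else {
				txn[i].value = s.main[o.key]
			}
		case "w":
			staging[o.key] = o.value // buffered, merged below
		}
	}
	for k, v := range staging {
		s.main[k] = v // merge the complete transaction at once
	}
	return txn
}
```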
Links:
- Read Committed from Jepsen docs
This repo is a Go workspace where each solution is in a separate module.
Each solution is in a folder with a main.go and a main_test.go for tests.
All solutions have automated tests using custom utils as a way to verify and play with the solutions, and to learn how to write tests in Go.
This project uses Nix for packages and Taskfile to run and build the solutions.
To run the solutions, use the run-* tasks:
task run-all
task run-uniqueids
task run-broadcast-efficient-ii # Will run and check the result output
To run the tests, use the test-* tasks:
task test-all
task test-uniqueids
task test-broadcast-efficient-ii