-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Katana: Add state update DA encodings #2474
Conversation
WalkthroughOhayo, sensei! This pull request introduces several new files and modules in the Changes
Possibly related PRs
Suggested labels
TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
Outside diff range and nitpick comments (11)
crates/katana/primitives/src/da/mod.rs (1)
1-5
: Ohayo, sensei! The module structure looks sharp!The organization of these modules seems well-thought-out, covering various aspects of data availability and cryptographic operations. It's a clean way to separate concerns and maintain modularity.
To enhance code clarity, consider adding documentation comments for each module. This will help other developers understand the purpose and contents of each module at a glance. For example:
/// Module for handling binary large objects (blobs) pub mod blob; /// Implementation of EIP-4844 related functionality pub mod eip4844; /// Utilities for encoding and decoding data pub mod encoding; /// Mathematical operations and utilities pub mod math; /// Serialization and deserialization functionality pub mod serde;What do you think, sensei? Would you like me to open a GitHub issue to track this documentation task?
crates/katana/primitives/src/da/serde.rs (1)
6-11
: Ohayo, sensei! The function signature looks good, but let's enhance the documentation!The function name and signature are clear and appropriate. However, we could improve the documentation to provide more context and details:
Consider expanding the documentation as follows:
/// Parse a hexadecimal string into a vector of `BigUint` representing blob data. /// /// This function expects a hexadecimal string of length 64 * BLOB_LEN characters. /// It splits the string into BLOB_LEN segments of 64 characters each and converts /// each segment into a `BigUint`. /// /// # Arguments /// * `data` - A hexadecimal string representing the blob data. /// /// # Returns /// A vector of `BigUint` values, where each `BigUint` represents a 32-byte word of the blob. /// /// # Panics /// This function will panic if the input string contains invalid hexadecimal characters /// or if its length is not exactly 64 * BLOB_LEN.crates/katana/primitives/Cargo.toml (1)
14-14
: Ohayo, sensei! New dependencies look good, but let's tweak num-bigint.The addition of
num-traits
as a workspace dependency is great! However, fornum-bigint
, consider using workspace versioning for consistency:-num-bigint = "0.4.6" +num-bigint.workspace = trueThis change will make it easier to manage versions across the workspace, sensei!
Also applies to: 27-27
crates/katana/primitives/src/da/eip4844.rs (2)
6-10
: Ohayo, sensei! The constant looks good, but let's add a bit more flavor to the comment!The
BLOB_LEN
constant is correctly defined and aligns with EIP-4844 specifications. Great job! However, we could make the comment more informative.Consider expanding the comment to provide more context:
-/// Length of the blob. +/// Length of the blob in bytes, as specified in EIP-4844. +/// This constant defines the size of each data blob used in the data availability layer. pub const BLOB_LEN: usize = 4096;
12-26
: Ohayo, sensei! The lazy_static definitions are mostly on point, but let's sharpen that katana a bit more!The
BLS_MODULUS
andGENERATOR
constants are correctly defined and align with EIP-4844 specifications. Excellent use oflazy_static
for these large values!However, the
TWO
constant seems a bit unnecessary as alazy_static
definition.Consider replacing the
TWO
constant with a simple integer or using theconst
keyword instead:- pub static ref TWO: BigUint = 2u32.to_biguint().unwrap(); + pub const TWO: u32 = 2;If you need
TWO
as aBigUint
in multiple places, you could create a function to convert it on demand:pub fn two_as_biguint() -> BigUint { 2u32.to_biguint().unwrap() }This approach would be more efficient and clearer in its intent.
crates/katana/primitives/src/da/math.rs (4)
35-38
: Nitpick: Avoid unnecessary cloning ofTWO
Ohayo, sensei! You can improve performance by referencing
TWO
directly instead of cloning it.Apply this change:
- res0.push(div_mod(a + b, TWO.clone(), p)); + res0.push(div_mod(a + b, &TWO, p)); - res1.push(div_mod(diff, TWO.clone() * x, p)); + res1.push(div_mod(diff, &TWO * x, p));
40-40
: Nitpick: Use&TWO
to avoid cloningOhayo, sensei! Similarly, you can avoid cloning
TWO
here by using a reference.Apply this change:
- new_xs.push(x.modpow(&TWO.clone(), p)); + new_xs.push(x.modpow(&TWO, p));
68-68
: Nitpick: Prevent unnecessary cloning indiv_mod
Ohayo, sensei! You can avoid cloning
TWO
in thediv_mod
function as well.Apply this change:
- a * b.modpow(&(p - TWO.clone()), p) % p + a * b.modpow(&(p - &TWO), p) % p
19-69
: Offer Assistance: Add unit tests for mathematical functionsOhayo, sensei! To ensure the correctness of
ifft
anddiv_mod
, it's beneficial to include unit tests covering various cases, including edge conditions.Would you like me to help draft some unit tests for these functions?
crates/katana/primitives/src/da/encoding.rs (2)
261-283
: Remove commented-out code to improve code cleanlinessOhayo, sensei! The
ContractUpdate::decode
function is commented out (lines 261-283). If this code is no longer needed, consider removing it to keep the codebase clean and maintainable. Alternatively, if it's a work in progress, you might want to implement it fully or add a TODO comment.
190-197
: Use proper Rust doc comments for documentationOhayo, sensei! The documentation for the
Metadata
struct's encoding format (lines 190-197) uses//
comments. To generate documentation properly and follow Rust conventions, please use///
for struct-level documentation or//!
for module-level documentation.Suggested change:
/// Metadata information about the contract update. -// Encoding format: -// -// ┌───────────────┬───────────────┬───────────────┬───────────────────────────┐ -// │ padding │ class flag │ new nonce │ no. storage updates │ -// ├───────────────┼───────────────┼───────────────┼───────────────────────────┤ -// │ 127 bits │ 1 bit │ 64 bits │ 64 bits │ -// └───────────────┴───────────────┴───────────────┴───────────────────────────┘ + /// Encoding format: + /// + /// ┌───────────────┬───────────────┬───────────────┬───────────────────────────┐ + /// │ padding │ class flag │ new nonce │ no. storage updates │ + /// ├───────────────┼───────────────┼───────────────┼───────────────────────────┤ + /// │ 127 bits │ 1 bit │ 64 bits │ 64 bits │ + /// └───────────────┴───────────────┴───────────────┴───────────────────────────┘
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
Files selected for processing (9)
- crates/katana/primitives/Cargo.toml (2 hunks)
- crates/katana/primitives/src/da/blob.rs (1 hunks)
- crates/katana/primitives/src/da/eip4844.rs (1 hunks)
- crates/katana/primitives/src/da/encoding.rs (1 hunks)
- crates/katana/primitives/src/da/math.rs (1 hunks)
- crates/katana/primitives/src/da/mod.rs (1 hunks)
- crates/katana/primitives/src/da/serde.rs (1 hunks)
- crates/katana/primitives/src/lib.rs (1 hunks)
- crates/katana/primitives/tests/blobs.rs (1 hunks)
Additional comments not posted (6)
crates/katana/primitives/src/lib.rs (1)
7-7
: Ohayo, sensei! Welcome to the newda
module!The addition of the
da
module looks great and follows the existing structure of the codebase. It's placed in the correct alphabetical order among the other module declarations.crates/katana/primitives/src/da/serde.rs (1)
1-4
: Ohayo, sensei! The imports look good!The necessary types and constants are imported correctly for the function's implementation.
crates/katana/primitives/tests/blobs.rs (1)
1-6
: Ohayo, sensei! The imports look good!The necessary modules and types are imported correctly, providing all the required functionality for blob parsing and encoding tests.
crates/katana/primitives/Cargo.toml (1)
31-31
: Ohayo again, sensei! Dev-dependencies look sharp!The changes to the dev-dependencies section are on point:
- Removing
num-traits
is correct as it's now a regular dependency.- Adding
rstest
as a workspace dependency is consistent with the project structure.These changes will keep our testing setup clean and maintainable. Excellent work, sensei!
crates/katana/primitives/src/da/eip4844.rs (1)
1-4
: Ohayo, sensei! The imports look sharp and focused!The necessary imports for
FromStr
,lazy_static
, andBigUint
are all present and accounted for. No unnecessary imports detected. Excellent work!crates/katana/primitives/src/da/blob.rs (1)
7-18
: Ohayo, sensei! Great job on the documentationThe function is well-documented with clear explanations of its purpose, arguments, and return values.
#[rstest] | ||
#[case("./tests/test-data/blobs/block_636262.txt")] | ||
#[case("./tests/test-data/blobs/block_636263.txt")] | ||
#[case("./tests/test-data/blobs/block_636264.txt")] | ||
fn parse_blobs_rt(#[case] blob: &str) -> Result<()> { | ||
let encoded = blob::recover(read(blob)); | ||
let state_update = encoding::decode_state_updates(&encoded)?; | ||
let _ = encode_state_updates(state_update); | ||
Ok(()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Ohayo, sensei! Let's enhance the parse_blobs_rt
test function.
The function structure looks good, but we can improve it to make it a more effective test:
- Add assertions to verify the correctness of the operations.
- Consider comparing the re-encoded state updates with the original encoded data.
Here's a suggested improvement:
#[rstest]
#[case("./tests/test-data/blobs/block_636262.txt")]
#[case("./tests/test-data/blobs/block_636263.txt")]
#[case("./tests/test-data/blobs/block_636264.txt")]
fn parse_blobs_rt(#[case] blob: &str) -> Result<()> {
let original_encoded = blob::recover(read(blob)?);
let state_update = encoding::decode_state_updates(&original_encoded)?;
let re_encoded = encode_state_updates(state_update.clone())?;
assert_eq!(original_encoded, re_encoded, "Re-encoded data should match the original");
assert!(!state_update.is_empty(), "Decoded state update should not be empty");
Ok(())
}
These changes will make the test more robust by verifying that the encoding and decoding processes are reversible and that the state update is not empty.
return arr; | ||
} | ||
|
||
let n = arr.len() / 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue: Ensure the input length is a power of two
Ohayo, sensei! The function assumes that arr.len()
is divisible by 2. If arr.len()
is odd, the calculation let n = arr.len() / 2;
may lead to incorrect behavior. Please add a check to confirm that the length of arr
is a power of two.
Consider adding this validation:
+ if arr.len() % 2 != 0 {
+ panic!("Input array length must be a power of two");
+ }
Committable suggestion was skipped due to low confidence.
let a = &arr[i]; | ||
let b = &arr[i + 1]; | ||
let x = &xs[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Missing Length Check for xs
in ifft
Calls
Ohayo, sensei! It appears there are no validations ensuring that xs.len()
is at least as long as arr.len()
before invoking ifft
. This could lead to potential out-of-bounds errors. Please add the necessary length checks to ensure safe usage.
Analysis chain
Verify: Check that xs
has sufficient length
Ohayo, sensei! The code uses xs[i]
in the loop, but there's no check to ensure xs
is long enough. Please verify that xs.len()
is at least as long as arr.len()
.
Run the following script to confirm:
Ohayo, sensei! Let's ensure that every ifft
call is safe by confirming that xs
is indeed long enough. Running the above script will help us verify that xs.len()
is checked against arr.len()
wherever ifft
is invoked. Once you have the results, feel free to share them for the next steps!
Ohayo, sensei! The previous check didn’t yield any results. Let’s run this new script to comprehensively verify that xs
is sufficiently long before each ifft
call. Share the results once it's executed, and we'll proceed from there!
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Ensure that 'xs' vectors provided to 'ifft' have appropriate lengths.
# Test: Search for all calls to 'ifft' and compare the lengths of 'arr' and 'xs'.
rg --type rust 'ifft\((?P<arr>[^,]+),\s*(?P<xs>[^,]+),' -A1
Length of output: 623
Script:
#!/bin/bash
# Description: Verify that 'xs.len()' is at least as long as 'arr.len()' before 'ifft' calls.
# Search for 'ifft(' and include 5 lines of context above each match
rg --type rust -C5 'ifft\(' | \
# Filter the results to include only those where 'xs.len() >= arr.len()' exists in the context
grep -B5 'xs\.len\(\)\s*>=\s*arr\.len\(\)'
Length of output: 74
Script:
#!/bin/bash
# Description: Verify that 'xs.len()' is at least as long as 'arr.len()' before 'ifft' calls with broader search patterns.
# Search for 'ifft(' and include 5 lines of context above each match
rg --type rust -C5 'ifft\(' | \
# Look for any comparison between 'xs.len()' and 'arr.len()' in the context
grep -E 'xs\.len\(\)\s*(>=|>|==)\s*arr\.len\(\)|arr\.len\(\)\s*(<=|<|==)\s*xs\.len\(\)'
Length of output: 119
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2474 +/- ##
==========================================
+ Coverage 68.45% 68.61% +0.15%
==========================================
Files 368 372 +4
Lines 48185 48426 +241
==========================================
+ Hits 32987 33226 +239
- Misses 15198 15200 +2 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
Files selected for processing (9)
- crates/katana/primitives/Cargo.toml (2 hunks)
- crates/katana/primitives/src/da/blob.rs (1 hunks)
- crates/katana/primitives/src/da/eip4844.rs (1 hunks)
- crates/katana/primitives/src/da/encoding.rs (1 hunks)
- crates/katana/primitives/src/da/math.rs (1 hunks)
- crates/katana/primitives/src/da/mod.rs (1 hunks)
- crates/katana/primitives/src/da/serde.rs (1 hunks)
- crates/katana/primitives/src/lib.rs (1 hunks)
- crates/katana/primitives/tests/blobs.rs (1 hunks)
Files skipped from review as they are similar to previous changes (8)
- crates/katana/primitives/Cargo.toml
- crates/katana/primitives/src/da/eip4844.rs
- crates/katana/primitives/src/da/encoding.rs
- crates/katana/primitives/src/da/math.rs
- crates/katana/primitives/src/da/mod.rs
- crates/katana/primitives/src/da/serde.rs
- crates/katana/primitives/src/lib.rs
- crates/katana/primitives/tests/blobs.rs
Additional comments not posted (1)
crates/katana/primitives/src/da/blob.rs (1)
1-5
: LGTM! Imports are well-organized.Ohayo, sensei! The import statements are concise and relevant to the functionality implemented in this file. Good job on keeping the imports clean and organized.
pub fn transform(data: Vec<BigUint>) -> Vec<BigUint> { | ||
let xs: Vec<BigUint> = (0..BLOB_LEN) | ||
.map(|i| { | ||
let bin = format!("{:012b}", i); | ||
let bin_rev = bin.chars().rev().collect::<String>(); | ||
GENERATOR.modpow(&BigUint::from_str_radix(&bin_rev, 2).unwrap(), &BLS_MODULUS) | ||
}) | ||
.collect(); | ||
|
||
fft(data, xs, &BLS_MODULUS) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve robustness and reduce code duplication
Ohayo, sensei! The transform
function looks good, but we can make it more robust and reduce code duplication:
- Add input validation and handle potential panic (similar to
recover
):
pub fn transform(data: Vec<BigUint>) -> Vec<BigUint> {
+ if data.len() != BLOB_LEN {
+ panic!("Invalid data length: expected {}, got {}", BLOB_LEN, data.len());
+ }
let xs: Vec<BigUint> = (0..BLOB_LEN)
.map(|i| {
let bin = format!("{:012b}", i);
let bin_rev = bin.chars().rev().collect::<String>();
- GENERATOR.modpow(&BigUint::from_str_radix(&bin_rev, 2).unwrap(), &BLS_MODULUS)
+ let exponent = BigUint::from_str_radix(&bin_rev, 2)
+ .expect("Failed to parse reversed binary string into BigUint");
+ GENERATOR.modpow(&exponent, &BLS_MODULUS)
})
.collect();
fft(data, xs, &BLS_MODULUS)
}
- To reduce code duplication, consider extracting the common logic into a separate function:
fn generate_xs() -> Vec<BigUint> {
(0..BLOB_LEN)
.map(|i| {
let bin = format!("{:012b}", i);
let bin_rev = bin.chars().rev().collect::<String>();
let exponent = BigUint::from_str_radix(&bin_rev, 2)
.expect("Failed to parse reversed binary string into BigUint");
GENERATOR.modpow(&exponent, &BLS_MODULUS)
})
.collect()
}
pub fn recover(data: Vec<BigUint>) -> Vec<BigUint> {
if data.len() != BLOB_LEN {
panic!("Invalid data length: expected {}, got {}", BLOB_LEN, data.len());
}
let xs = generate_xs();
ifft(data, xs, &BLS_MODULUS)
}
pub fn transform(data: Vec<BigUint>) -> Vec<BigUint> {
if data.len() != BLOB_LEN {
panic!("Invalid data length: expected {}, got {}", BLOB_LEN, data.len());
}
let xs = generate_xs();
fft(data, xs, &BLS_MODULUS)
}
These changes will improve the robustness of both functions and make the code more maintainable.
pub fn recover(data: Vec<BigUint>) -> Vec<BigUint> { | ||
let xs: Vec<BigUint> = (0..BLOB_LEN) | ||
.map(|i| { | ||
let bin = format!("{:012b}", i); | ||
let bin_rev = bin.chars().rev().collect::<String>(); | ||
GENERATOR.modpow(&BigUint::from_str_radix(&bin_rev, 2).unwrap(), &BLS_MODULUS) | ||
}) | ||
.collect(); | ||
|
||
ifft(data, xs, &BLS_MODULUS) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add input validation and handle potential panic
Ohayo, sensei! The recover
function looks good overall, but I have two suggestions to improve its robustness:
- Add input validation to ensure the correct data length:
pub fn recover(data: Vec<BigUint>) -> Vec<BigUint> {
+ if data.len() != BLOB_LEN {
+ panic!("Invalid data length: expected {}, got {}", BLOB_LEN, data.len());
+ }
let xs: Vec<BigUint> = (0..BLOB_LEN)
- Handle the potential panic from
unwrap()
:
.map(|i| {
let bin = format!("{:012b}", i);
let bin_rev = bin.chars().rev().collect::<String>();
- GENERATOR.modpow(&BigUint::from_str_radix(&bin_rev, 2).unwrap(), &BLS_MODULUS)
+ let exponent = BigUint::from_str_radix(&bin_rev, 2)
+ .expect("Failed to parse reversed binary string into BigUint");
+ GENERATOR.modpow(&exponent, &BLS_MODULUS)
})
These changes will make the function more robust and easier to debug if issues arise.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
pub fn recover(data: Vec<BigUint>) -> Vec<BigUint> { | |
let xs: Vec<BigUint> = (0..BLOB_LEN) | |
.map(|i| { | |
let bin = format!("{:012b}", i); | |
let bin_rev = bin.chars().rev().collect::<String>(); | |
GENERATOR.modpow(&BigUint::from_str_radix(&bin_rev, 2).unwrap(), &BLS_MODULUS) | |
}) | |
.collect(); | |
ifft(data, xs, &BLS_MODULUS) | |
} | |
pub fn recover(data: Vec<BigUint>) -> Vec<BigUint> { | |
if data.len() != BLOB_LEN { | |
panic!("Invalid data length: expected {}, got {}", BLOB_LEN, data.len()); | |
} | |
let xs: Vec<BigUint> = (0..BLOB_LEN) | |
.map(|i| { | |
let bin = format!("{:012b}", i); | |
let bin_rev = bin.chars().rev().collect::<String>(); | |
let exponent = BigUint::from_str_radix(&bin_rev, 2) | |
.expect("Failed to parse reversed binary string into BigUint"); | |
GENERATOR.modpow(&exponent, &BLS_MODULUS) | |
}) | |
.collect(); | |
ifft(data, xs, &BLS_MODULUS) | |
} |
Summary by CodeRabbit
New Features
recover
function for data recovery from blobs.BigUint
vectors.blob
,eip4844
,encoding
,math
, andserde
.Bug Fixes
Tests