-
Notifications
You must be signed in to change notification settings - Fork 900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Print a deterministic length of commit hash in --version #6258
Conversation
.and_then(|r| String::from_utf8(r.stdout).ok()) | ||
.ok()?; | ||
let mut stdout = String::from_utf8(output.stdout).ok()?; | ||
stdout.truncate(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we're truncating to 10?
I checked the hash length for some other tools and they're either 8-9 characters long:
rustc 1.81.0-nightly (6b0f4b5ec 2024-06-24)
cargo 1.81.0-nightly (bc89bffa5 2024-06-22)
clippy 0.1.81 (6b0f4b5e 2024-06-24)
rustup 1.27.1 (54dd3d00f 2024-04-24)
rust-analyzer 1.81.0-nightly (6b0f4b5e 2024-06-24)
Also, I think some of the non-determinism came from using the --short
option with git's rev-parse
subcommand. Here are the docs:
--short[=length]
Same as --verify but shortens the object name to a unique prefix with at least length
characters. The minimum length is 4, the default is the effective value of the
core.abbrev configuration variable (see git-config(1))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rust-lang/rust repo (which rustfmt releases are published from, and therefore the one whose commit hashes matter) currently has about 262k commits on master and 398k commits reachable from PR branches. Using 800k in the estimations to account for some growth, the formula for probability of any colliding H-digit hash over N years is:
where 429 is the estimated number of releases per year, including 365 nightlies, 8 stables, and 7 betas per stable release. For 8-digit hash, there is 33% probability of a collision in 5 years. For 9-digit, 2.5%, and 10-digit, 0.16%.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed and thoughtful explanation!
@dtolnay if you're able to rebase this PR to bring it up to date I can get it merged. |
Previously,
rustfmt --version
would print nondeterministic output when built from the exact same source code and git commit. The number of hex digits printed in the commit hash would vary based on how many other branches and tags you had fetched so far, what other commits you had been working on in other branches, how recently you had rungit gc
, platform-specific variation in git's default configuration, and platform differences in the sequence of steps performed by the release pipeline.You can see this in the official build of
rustfmt
that is part of the Rust 1.80.0 stable release.The rustfmt for x86_64-unknown-linux-gnu prints:
Whereas for aarch64-apple-darwin it prints: