Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip uninteresting commits for blame #1743

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

cruessler
Copy link
Contributor

@cruessler cruessler commented Dec 27, 2024

This PR is a PoC. It is a first attempt at implementing custom graph traversal for blame::file. This allows us to skip commits that we know don’t contain changes to the file being blamed. The PR is rough around the edges and contains a lot of TODOs and todo!()s.

A first round of tests showed a significant reduction in the number of commits traversed for gix blame STABILITY.md:

# before
Statistics {
    commits_traversed: 9261,
    commits_to_tree: 10710,
    trees_decoded: 10710,
    trees_diffed: 37,
    blobs_diffed: 36,
}

# after
Statistics {
    commits_traversed: 3657,
    commits_to_tree: 3749,
    trees_decoded: 3749,
    trees_diffed: 37,
    blobs_diffed: 36,
}

Also, profiling gix blame STABILITY.md on my machine revealed that the command took about 20 % less time to run.

@cruessler cruessler force-pushed the skip-uninteresting-commits-for-blame branch from 50e9f97 to 20a90a2 Compare January 17, 2025 16:19
Pass blame to parent in `process_change`. `git`’s algorithm only seems
to keep the current suspect for unblamed hunks that were the direct
result of splitting an existing unblamed hunk because it matched with a
change. All other hunks appear to be blamed on the parent without
further checks.
@cruessler cruessler force-pushed the skip-uninteresting-commits-for-blame branch from 20a90a2 to efa0057 Compare January 21, 2025 08:25
@cruessler
Copy link
Contributor Author

I think (and hope :-)) that this PR is now ready for the first round of reviews. There’s still two or three todo!() left, but I’ve addressed all of the TODOs as far as I can see and I’m really satisfied with the numbers so far as there seems to be a considerable speedup.

Also, the traversal is now much closer to what git does, and I hope that it most of the time is even identical unless there are differences in diffing results that can make traversals diverge. I haven’t found ways to validate this claim at a large scale, though, so I want to be cautious at this point.

gitoxide on  main [!?⇣] is 📦 v0.40.0 via 🦀 v1.84.0 took 5s
❯ hyperfine "$HOME/github/Byron/gitoxide/target/release/gix blame STABILITY.md" "$HOME/worktrees/gitoxide/branch-1/target/release/gix blame STABILITY.md"
Benchmark 1: /home/christoph/github/Byron/gitoxide/target/release/gix blame STABILITY.md
  Time (mean ± σ):     194.6 ms ±   2.3 ms    [User: 174.0 ms, System: 19.8 ms]
  Range (min … max):   190.8 ms … 198.5 ms    15 runs

Benchmark 2: /home/christoph/worktrees/gitoxide/branch-1/target/release/gix blame STABILITY.md
  Time (mean ± σ):      99.2 ms ±   1.5 ms    [User: 87.3 ms, System: 11.3 ms]
  Range (min … max):    97.1 ms … 104.1 ms    29 runs

Summary
  '/home/christoph/worktrees/gitoxide/branch-1/target/release/gix blame STABILITY.md' ran
    1.96 ± 0.04 times faster than '/home/christoph/github/Byron/gitoxide/target/release/gix blame STABILITY.md'

@cruessler cruessler marked this pull request as ready for review January 21, 2025 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant