Filesystem Monitoring for Worktree Changes #5982

Draft
wants to merge 1 commit into
base: master

Byron
Collaborator

@Byron Byron commented Jan 17, 2025

This PR evaluates how to do portable and performant worktree tracking so that the app's view of the worktree stays in sync with what's on disk.
With such a system, keeping the app in sync is cheaper than running a full git status every time something changes on disk.

Requirements

  • portable
    • Support for Linux, Windows and macOS
  • Multi-Worktree
    • One 'system' can handle many worktrees. So if it's a separate binary, it should be able to deal with worktrees of many Git repositories at once.
  • Poll-fallback
    • Not all filesystems support efficient watching, but there should still be a way to learn about changes, possibly by emulating the essentials of watching with polling.
  • Support for .git changes
    • Allow clients to be notified when a ref changes, when the index changes, or when anything else changes that is relevant to what's shown in the UI.
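
As an illustration of the poll-fallback requirement, here is a minimal sketch of snapshot-based polling, assuming that comparing modification time and size per file is good enough to detect a change. The names `Stamp`, `Snapshot`, `snapshot` and `changed_paths` are hypothetical and not part of any existing crate; a real implementation would also have to tolerate files vanishing mid-walk and honour ignore rules.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical per-entry fingerprint: modification time (if available) and size.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Stamp {
    mtime: Option<SystemTime>,
    len: u64,
}

/// Maps every non-directory entry below a root to its fingerprint.
pub type Snapshot = BTreeMap<PathBuf, Stamp>;

/// Walk `root` iteratively and record a fingerprint for every non-directory entry.
/// Symlinks are recorded as entries and never followed, so link cycles cannot loop.
pub fn snapshot(root: &Path) -> io::Result<Snapshot> {
    let mut out = Snapshot::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let meta = entry.metadata()?;
            if meta.is_dir() {
                stack.push(entry.path());
            } else {
                let stamp = Stamp { mtime: meta.modified().ok(), len: meta.len() };
                out.insert(entry.path(), stamp);
            }
        }
    }
    Ok(out)
}

/// Paths that were added, removed, or whose fingerprint differs between two snapshots.
pub fn changed_paths(old: &Snapshot, new: &Snapshot) -> BTreeSet<PathBuf> {
    let mut changed = BTreeSet::new();
    for (path, stamp) in new {
        // Covers both new paths and paths with a different fingerprint.
        if old.get(path) != Some(stamp) {
            changed.insert(path.clone());
        }
    }
    for path in old.keys() {
        if !new.contains_key(path) {
            changed.insert(path.clone());
        }
    }
    changed
}
```

A poller would call `snapshot` on a timer, pass the previous and current result to `changed_paths`, and emit the same kind of events a native watcher would, so that consumers need not know which backend is active.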

Verdict of Research

Even though one could spend time on making the builtin Git daemon usable so that we can…

  • …start it on demand
  • …find its socket and communicate with it

…it won't actually do anything that our own file system watcher isn't doing already. Further, it won't give any paths that are inside the .git directory.
I'd rather spend the time making our own filesystem monitor better (possibly even by contributing fixes to the upstream project) than dealing with Git's monitoring daemon, which we can neither control nor patch easily.

Something we can certainly learn from is its event handling, which is well thought out and built for robustness.

Also, there always has to be a variant of this 'subscription' system that works without the monitor, for network filesystems, so I don't think using Git fsmonitor saves any time (quite the opposite). Let's trust in Rust.

Tasks

  • evaluate git fsmonitor
  • sketch crate API
  • sketch fallback
  • implementation
  • cli status -w to watch in the CLI
  • add a setting to disable the watcher mechanism entirely
    • This is a setting that might be good to have in Git so it can be layered.
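
One possible starting point for the crate API sketch is the event classification implied by the ".git changes" requirement. The following is a hedged sketch only: `ChangeKind` and `classify` are hypothetical names, and it assumes the Git directory is a plain `.git` directory directly below the worktree root, which does not hold for linked worktrees or submodules.

```rust
use std::path::{Component, Path};

/// Hypothetical categories a watcher could report to the UI.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChangeKind {
    /// A file outside of `.git` changed.
    Worktree,
    /// `.git/index` changed.
    Index,
    /// `.git/HEAD` changed.
    Head,
    /// Something below `.git/refs` or `.git/packed-refs` changed.
    Ref,
    /// Any other path inside `.git`, e.g. objects or lock files.
    OtherGitDir,
}

/// Classify a path given relative to the worktree root.
/// Assumption: the Git directory is `<worktree>/.git`.
pub fn classify(rel: &Path) -> ChangeKind {
    // Yield each component as `Some(&str)`, or `None` if it is not a normal UTF-8 name.
    let mut parts = rel.components().map(|c| match c {
        Component::Normal(name) => name.to_str(),
        _ => None,
    });
    if parts.next() != Some(Some(".git")) {
        return ChangeKind::Worktree;
    }
    match parts.next() {
        Some(Some("index")) => ChangeKind::Index,
        Some(Some("HEAD")) => ChangeKind::Head,
        Some(Some("refs")) | Some(Some("packed-refs")) => ChangeKind::Ref,
        _ => ChangeKind::OtherGitDir,
    }
}
```

For example, `classify(Path::new(".git/refs/heads/main"))` yields `ChangeKind::Ref`, while `.gitignore` is an ordinary worktree file and yields `ChangeKind::Worktree`.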

Research

git fsmonitor--daemon

Here are the docs.

  • supports external 'hooks' watcher processes or built-in implementation
  • there are three protocols, two for the hooks (V1 and V2) with marginal differences (timestamp vs opaque token), and a more efficient one (IPC) used for the internal implementation.
  • The Git daemon's memory footprint is minimal (2x 5MB)
  • On the WebKit repository, git status takes ~600ms to ~900ms with the Git daemon and 2.8s without it. That is faster, but still far from instant.
  • A hook-based monitor is of the watchman variant, which is very portable, but also a memory hog. Also, it's definitely not installed by default, and installing it isn't always easy as it contains C extensions.
  • The built-in monitor daemon is available since Git 2.36 or so
  • By default, the fsmonitor seems to speed up only the tracked-files part of the git status operation; to speed up the untracked-files part as well, the untracked cache has to be enabled so that Git can make full use of it.
  • The daemon can be started on demand, which happens with git status or whatever client we know.
  • The IPC protocol uses packet lines: the client connects to the named pipe or socket, sends a request, and receives response packet lines, with a flush packet signalling the end of the response.
  • Shortcomings of the Git fsmonitor daemon
    • It has no notion of a submodule, so it reports submodule changes as happening in the superrepo. The client has to handle that.
    • doesn't usually work correctly on network mounted filesystems, but can be forced to try anyway with fsmonitor.allowRemote=true
    • the Unix Domain Socket is placed in the .git/ directory where the daemon is available, and that isn't supported on every filesystem the repository could be stored on. If that's the case, the socket is created in $HOME/git-fsmonitor-* or else the location of fsmonitor.socketDir is used. Finding the right socket isn't entirely trivial but there are sources for that.
  • builtin Git fsmonitor communication
    • Code for finding the domain socket path: fsmonitor_ipc__get_path
    • refresh_fsmonitor
    • It's notable that the backend would need a way to store the token to prevent asking for old files each time. Also, we'd either leave the daemon running, or try to shut it down ourselves if we know we started it. Determining this might not be trivial.
    • all the code in compat/fsmonitor/
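
To make the packet-line framing mentioned above concrete, here is a minimal sketch of Git's pkt-line format: a four-digit hexadecimal length that includes the four header bytes, followed by the payload, with `0000` as the flush packet. It only covers framing; the actual request and response contents of the fsmonitor IPC are not shown, and the function names are hypothetical.

```rust
use std::io::{self, Read, Write};

/// Largest payload that fits into a single pkt-line (65520 bytes minus the 4-byte header).
const MAX_PAYLOAD: usize = 65516;

fn invalid(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Write one data packet: 4 hex digits of total length, then the payload.
pub fn write_packet(out: &mut impl Write, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_PAYLOAD {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload too large for one pkt-line",
        ));
    }
    write!(out, "{:04x}", payload.len() + 4)?;
    out.write_all(payload)
}

/// Write the flush packet that marks the end of a message.
pub fn write_flush(out: &mut impl Write) -> io::Result<()> {
    out.write_all(b"0000")
}

/// Read data packets until a flush packet is seen and return their payloads.
pub fn read_until_flush(input: &mut impl Read) -> io::Result<Vec<Vec<u8>>> {
    let mut packets = Vec::new();
    loop {
        let mut header = [0u8; 4];
        input.read_exact(&mut header)?;
        let text = std::str::from_utf8(&header).map_err(|_| invalid("non-ASCII length"))?;
        let len = usize::from_str_radix(text, 16).map_err(|_| invalid("bad hex length"))?;
        match len {
            // `0000` is the flush packet: the response is complete.
            0 => return Ok(packets),
            // Lengths 1 to 3 are special or invalid packets this sketch doesn't handle.
            1..=3 => return Err(invalid("unsupported special packet")),
            _ => {
                let mut payload = vec![0u8; len - 4];
                input.read_exact(&mut payload)?;
                packets.push(payload);
            }
        }
    }
}
```

A client would write its request with `write_packet` followed by `write_flush` to the connected socket or named pipe, then call `read_until_flush` on the same stream to collect the response.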

For later

  • provide patch for a change
  • locking information

Labels
rust Pull requests that update Rust code