-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't check out the crates.io index locally #4026
Conversation
r? @brson (rust_highfive has picked a reviewer for you, use r? to override) |
r? @matklad |
src/cargo/sources/registry/remote.rs
Outdated
|
||
// Note that this `'static lifetime here is actually a lie, it's actually a | ||
// borrow into the `repo` object below. We're guaranteed, though, that if | ||
// filled in `tree` will be destroyed first, so this should be ok. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What guarantees that tree will be destroyed first? Unspecified drop order? In that case, maybe this is a good case for ManuallyDrop
? Or is that not yet usable in cargo due to instability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexcrichton ping. Should there be a crate for this sort of stuff? /~https://github.com/Kimundi/owning-ref-rs perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry missed this! Yes the unspecified drop order is what guarantees this. Lots of projects are relying on this so I don't think it's necessary to go out of the way and use ManuallyDrop
, and yeah I'd also prefer to keep Cargo on stable.
@matklad unfortunately that crate won't help as it's targeted at Rust pointers, whereas here it's all phantom lifetimes through libgit2 :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's in an Option
, you could explicitly take()
it first in an impl Drop for RemoteRegistry
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed! I'll do that.
☔ The latest upstream changes (presumably #4024) made this pull request unmergeable. Please resolve the merge conflicts. |
6bbd534
to
dcbbfab
Compare
src/cargo/sources/registry/index.rs
Outdated
// interpretation of each line here and older cargo will simply | ||
// ignore the new lines. | ||
let lines = contents.split(|b| *b == b'\n') | ||
.filter_map(|b| str::from_utf8(b).ok()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we should not swallow utf8-decoding errors here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or are we planing to switch to binary format some day?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah this was mostly inspired from discussion on the RFC about schema versioning. I have no plans to break this personally, but it seems reasonable to be somewhat defensive about future changes to the index just for maximal flexibility of future cargo's implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm totally ok with self.parse_registry_package(line).ok()
, it's only str::from_utf8(b).ok()
that feels overly defensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable!
src/cargo/sources/registry/local.rs
Outdated
@@ -34,7 +35,11 @@ impl<'cfg> RegistryData for LocalRegistry<'cfg> { | |||
&self.index_path | |||
} | |||
|
|||
fn config(&self) -> CargoResult<Option<RegistryConfig>> { | |||
fn load(&self, root: &Path, path: &str) -> CargoResult<Vec<u8>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not path: &Path
? We convert str
to Path
both for Local and Remote registry anyway. Those slashes format!("{}/{}/{}", &fs_name[0..2], &fs_name[2..4], fs_name)
make me nervous :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And do we need root
here? Can't we reconstruct it from index_path
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah my thinking of passing in root
is that index_path
returns a Filesystem
which is an "unlocked path", but here we've always got a locked path (locked elsewhere) so looking at a Path
is proof of that.
I was originally unsure what would happen if we take a \
-separated path when we go down to libgit2, I'm not sure if it handles internally the slash differences. Only one way to find out!
src/cargo/sources/registry/remote.rs
Outdated
// Note that this `'static lifetime here is actually a lie, it's actually a | ||
// borrow into the `repo` object below. We're guaranteed, though, that if | ||
// filled in `tree` will be destroyed first, so this should be ok. | ||
tree: LazyCell<RefCell<Option<git2::Tree<'static>>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need LazyCell
on top of RefCell
? LazyCell
is useful to return &T
and not Ref<T>
, but we are returning a Ref
anyway, so just RefCell<Option<git2::Tree<'static>>>
should be enough levels of indirection...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm excellent point!
1ea815a
to
a4e850f
Compare
Pushed some updates |
src/cargo/sources/registry/remote.rs
Outdated
let handle = ops::http_handle(self.config)?; | ||
self.handle.fill(RefCell::new(handle)).ok().unwrap(); | ||
Ok(self.handle.borrow().unwrap()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like easy
and repo
could use LazyCell::get_or_try_init
instead of manually unwrapping things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha yes indeed! I thought I tried that but thanks for catching
e51d729
to
c2e82c2
Compare
Updated |
LGTM, though there was seemingly legitimate failure on appveyor on the previous build. |
7ff4bca
to
b7414ce
Compare
Bah looks like libgit2 cares about \ vs / |
@bors: r=matklad |
📌 Commit b7414ce has been approved by |
🔒 Merge conflict |
☔ The latest upstream changes (presumably #4032) made this pull request unmergeable. Please resolve the merge conflicts. |
This commit moves working with the crates.io index to operating on the git object layers rather than actually literally checking out the index. This is aimed at two different goals: * Improving the on-disk file size of the registry * Improving cloning times for the registry as the index doesn't need to be checked out The on disk size of my `registry` folder of a fresh check out of the index went form 124M to 48M, saving a good chunk of space! The entire operation took about 0.6s less on a Unix machine (out of 4.7s total for current Cargo). On Windows, however, the clone operation went from 11s to 6.7s, a much larger improvement! Closes rust-lang#4015
b7414ce
to
15cc376
Compare
@bors: r=matklad |
📌 Commit 15cc376 has been approved by |
Don't check out the crates.io index locally This commit moves working with the crates.io index to operating on the git object layers rather than actually literally checking out the index. This is aimed at two different goals: * Improving the on-disk file size of the registry * Improving cloning times for the registry as the index doesn't need to be checked out The on disk size of my `registry` folder of a fresh check out of the index went form 124M to 48M, saving a good chunk of space! The entire operation took about 0.6s less on a Unix machine (out of 4.7s total for current Cargo). On Windows, however, the clone operation went from 11s to 6.7s, a much larger improvement! Closes #4015
☀️ Test successful - status-appveyor, status-travis |
Allows for seamless ransition for when rust-lang/cargo#4026 lands Closes #32
This commit moves working with the crates.io index to operating on the git
object layers rather than actually literally checking out the index. This is
aimed at two different goals:
checked out
The on disk size of my
registry
folder of a fresh check out of the index wentform 124M to 48M, saving a good chunk of space! The entire operation took about
0.6s less on a Unix machine (out of 4.7s total for current Cargo). On Windows,
however, the clone operation went from 11s to 6.7s, a much larger improvement!
Closes #4015