Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rustc cache #5359

Merged
merged 11 commits into from
Apr 16, 2018
Merged

Rustc cache #5359

merged 11 commits into from
Apr 16, 2018

Conversation

matklad
Copy link
Member

@matklad matklad commented Apr 14, 2018

This implements rustc caching, to speed-up no-op builds.

The cache is per-project, and stored in target directory. To implement this, I had to move rustc from Config down to BuildConfig.

closes #5315

matklad added 4 commits April 14, 2018 11:12
We want to create `rustc` only when we already have workspace, so that
we could use the `target` dir to store cached info about the compiler.
@rust-highfive
Copy link

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

paths::mtime(&rustup_rustc)?.hash(&mut hasher);
}
_ => (),
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This match requires some scrutiny during review, because it is not tested, but kinda important :-)

We can't just look at RUSTUP_TOOLCHAIN, because the toolchain name does not change when you do rustup update, it is just stable-x86_64-unknown-linux-gnu. So we actually have to look inside the toolchain and mtime the real rustc binary.

This still has a failure mode unfortunately :( If the cargo is not rustup, but rustc is rustup, then these env variables won't be set, and we won't be able to detect compieler changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible mitigation here I think would be to look at the result of the resolved executable. If rustc resolves to $CARGO_HOME/.bin/rustc then I think we can say with a high level of certainty that it's a rustup rustc. We could even say that if we detect a rustup rustc and RUSTUP_HOME and/or RUSTUP_TOOLCHAIN isn't set that we skip this cache entirely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, implemented that logic!

@matklad
Copy link
Member Author

matklad commented Apr 14, 2018

@bors try

@bors
Copy link
Contributor

bors commented Apr 14, 2018

⌛ Trying commit 42f347a with merge 88ab8ed...

bors added a commit that referenced this pull request Apr 14, 2018
Rustc cache

This implements rustc caching, to speed-up no-op builds.

The cache is per-project, and stored in `target` directory. To implement this, I had to move `rustc` from `Config` down to `BuildConfig`.

closes #5315
@bors
Copy link
Contributor

bors commented Apr 14, 2018

💔 Test failed - status-appveyor

Copy link
Contributor

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite get what the earlier refactorings are trying to achieve. However, I highly appreciate that we should probably cache the output of the rustc invocation. Can't we just hide the cache inside Config::rustc(). Also, should we (first? I don't know) try to combine the calls to rustc for getting the version and the TargetInfo?

self.maybe_get_tool("rustc_wrapper")?,
)
})
pub fn new_rustc(&self) -> CargoResult<Rustc> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why rename this here? Is that just cosmetic? Maybe extract it into a separate commit that explains why this makes sense on its own?

if let Ok(rustc) = cx.config.rustc() {
rustc.verbose_version.hash(&mut hasher);
}
cx.build_config.rustc.verbose_version.hash(&mut hasher);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my perception, it seemed that build_config should probably stay immutable over the lifetime of the Context (and in fact that is enforced, I think, by #5358).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, not sure I understand this comment :) build_config is not mutated here at all?

@matklad
Copy link
Member Author

matklad commented Apr 15, 2018

I don't quite get what the earlier refactorings are trying to achieve. However, I highly appreciate that we should probably cache the output of the rustc invocation. Can't we just hide the cache inside Config::rustc(

I’ve initially tried to just add caching to Config.rustc, but ended up moving rustc to BuildConfig first. The reason is that we need to store the cache somewhere, and the logical choice is to store it in the target directory of the project. However, Config does not know the target directory, because it does not know the current project.

@djc
Copy link
Contributor

djc commented Apr 15, 2018

I’ve initially tried to just add caching to Config.rustc, but ended up moving rustc to BuildConfig first. The reason is that we need to store the cache somewhere, and the logical choice is to store it in the target directory of the project. However, Config does not know the target directory, because it does not know the current project.

Yeah, that part actually makes sense to me -- I had been thinking about doing the same. However, it seems you've made the mechanics of doing so more complicated than what seems necessary. I guess you had some other goals at the same time, or maybe this rustc thing is more complicated than what I've understood so far.

@matklad
Copy link
Member Author

matklad commented Apr 16, 2018

@bors try

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking fantastic to me, thanks so much for doing this @matklad!

One final aspect I'd want to consider as well is the rustbuild integration aspect. I think though that all the logic here is sufficient to detect changes in that situation, and we can always set the env var if need be to disable this caching entirely.

@@ -129,6 +158,12 @@ pub fn append(path: &Path, contents: &[u8]) -> CargoResult<()> {
Ok(())
}

pub fn mtime(path: &Path) -> CargoResult<FileTime> {
let meta =
fs::metadata(path).chain_err(|| internal(format!("failed to stat `{}`", path.display())))?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general internal hasn't worked out great for Cargo and I've been trying to remove it wherever I see it, so while you're here mind removing this internal?


/// It is a well known that `rustc` is not the fastest compiler in the world.
/// What is less known is that even `rustc --version --verbose` takes about a
/// hundred milliseconds! Because we need compiler version info even for no-op
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this cc the Cargo/rustc issues as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!


impl Drop for Cache {
fn drop(&mut self) {
match (&self.cache_location, self.dirty) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be started with:

if !self.dirty {
    return 
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er well I'll leave it up to you, I'm just personally always looking to aggressively reduce indent!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it looks much nicer that way, because I don't have to wrangle with not moving out cahce_location as well!


paths::mtime(&path)?.hash(&mut hasher);

match (env::var("RUSTUP_HOME"), env::var("RUSTUP_TOOLCHAIN")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you insert a comment here as to what's going on?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

paths::mtime(&rustup_rustc)?.hash(&mut hasher);
}
_ => (),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible mitigation here I think would be to look at the result of the resolved executable. If rustc resolves to $CARGO_HOME/.bin/rustc then I think we can say with a high level of certainty that it's a rustup rustc. We could even say that if we detect a rustup rustc and RUSTUP_HOME and/or RUSTUP_TOOLCHAIN isn't set that we skip this cache entirely.

@@ -260,7 +260,8 @@ pub fn compile_ws<'a>(
bail!("jobs must be at least 1")
}

let mut build_config = BuildConfig::new(config, jobs, &target)?;
let rustc_info_cache = ws.target_dir().join("rustc_info.json").into_path_unlocked();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind throwing a . on the front to hide it away?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️

@matklad
Copy link
Member Author

matklad commented Apr 16, 2018

However, it seems you've made the mechanics of doing so more complicated than what seems necessary.

Quite probably, but, if I did this, it is because of being stupid, and not because of being smart :-) What simplifications would you suggest?

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Apr 16, 2018

📌 Commit 76b0a8a has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Apr 16, 2018

⌛ Testing commit 76b0a8a with merge d0829a2...

bors added a commit that referenced this pull request Apr 16, 2018
Rustc cache

This implements rustc caching, to speed-up no-op builds.

The cache is per-project, and stored in `target` directory. To implement this, I had to move `rustc` from `Config` down to `BuildConfig`.

closes #5315
@matklad
Copy link
Member Author

matklad commented Apr 16, 2018

For posterity, a couple of more details:

  • write to the .rustc_info.json is racy across Cargo processes. This shoudn't be a problem because we always ignore errors in this file and, in practice, it should be mostly read-only because we write cache only if something did change.

  • it's interesting to compare this with what sccache does. sccache is actually pretty smart: for compiler fingerprint, it hashes the contents of rustc --print sysroot, which looks like a bulletproof way to do compiler fingerprinting. Unfortunately, we can't use this here because it requires to launch compiler once, the very thing we'd love to avoid.

@bors
Copy link
Contributor

bors commented Apr 16, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing d0829a2 to master...

@bors bors merged commit 76b0a8a into rust-lang:master Apr 16, 2018
@matklad matklad deleted the rustc-cache branch April 16, 2018 18:41
@matklad
Copy link
Member Author

matklad commented Apr 16, 2018

Let's tag it relnotes, because some folks might need to be aware of CARGO_CACHE_RUSTC_INFO, if they use something like rustup, but not rustup, to manage toolchains.

@matklad matklad added the relnotes Release-note worthy label Apr 16, 2018
ehuss added a commit to ehuss/cargo that referenced this pull request Jun 4, 2018
I think this was left behind as part of rust-lang#5359.
bors added a commit that referenced this pull request Jun 4, 2018
Remove some minor, unused code.

I think this was left behind as part of #5359.
@luser
Copy link
Contributor

luser commented Jul 27, 2018

it's interesting to compare this with what sccache does. sccache is actually pretty smart: for compiler fingerprint, it hashes the contents of rustc --print sysroot, which looks like a bulletproof way to do compiler fingerprinting. Unfortunately, we can't use this here because it requires to launch compiler once, the very thing we'd love to avoid.

Coming in here late but I ran into this issue while reading old release notes. Note that sccache only uses the hash of the sysroot libs as input to the hash function for compilation caching--it doesn't recalculate that hash every time (that'd be expensive!) it caches that hash internally with the same strategy you're using here (by comparing mtimes). sccache doesn't have the checks for rustup you added, we've got an open issue on handling rustup properly.

@ehuss ehuss added this to the 1.27.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Release-note worthy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cargo no-op build time is huge
7 participants