-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable ThinLTO for rustc on x86_64-apple-darwin
#103647
Conversation
@bors try |
⌛ Trying commit 39bf1507ecf95e4f5b066de65cf9929a7650e1c3 with merge b4d23f6f6fd8602735600f2aa54e884005c277e8... |
☀️ Try build successful - checks-actions |
The job completed in 2h57 which is maybe too close to a timeout ? Looking at the first couple pages of successful CI jobs, the range was from 2h16 to 2h45. |
(I've removed the CI hacks) The only x86 darwin system I have access to is pretty noisy (old iMac), making the majority of the So I've tried the longer benchmarks, on full clean builds (not only the leaf crate like the perf collector does) it seems that on this machine the ThinLTOed rustc Two things would be interesting to know:
For the two questions above, I'll r? @Mark-Simulacrum to have your thoughts if that's OK. |
x86_64-apple-darwin
x86_64-apple-darwin
I'm not too worried about claiming this is a win for Macs if it was a win for Linux; that seems like a pretty reasonable claim in my eyes. Especially if we can have some validation on noisier machines. That said I think making this switch will probably bring too high a cost at this time, given that these runners dominate our runtimes. It's a little bit annoying in the sense that making the change for just beta or similar might drive down our runtimes on nightly and give more room for this kind of optimization... I think if we don't see headroom through other ways (e.g., different CI runners, optimizations to the things running in them, etc) then it might be worth trying that. It feels relatively safe as a beta-exclusive (i.e. not on nightly or stable), since beta sees very little usage anyway. |
That makes sense, thanks a bunch. I'll mark this blocked for now rather than close it outright (but feel free to do so if you prefer): so that we can revisit in the future and decide which of the two paths to take then. |
With the new osx builders from #105212, I believe we should now have the capacity to enable this change: @rustbot label -S-blocked What do you think @Mark-Simulacrum, can we land this ? |
Let's merge this, mostly so that we can get some data gathered on what the hit is. We'll reconsider if it is a significant increase that begins to drive the slowest builds again. For the record: So looks like we're around 1 hour per apple dist builder right now. @bors r+ |
⌛ Testing commit 3a085f7 with merge 32d70cc436f06f18c510b2e964db6f7c21b5bbf3... |
💔 Test failed - checks-actions |
@bors retry Failed to update toolstate |
This comment was marked as resolved.
This comment was marked as resolved.
☀️ Test successful - checks-actions |
Finished benchmarking commit (bdb07a8): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. |
Revert "enable ThinLTO for rustc on x86_64-apple-darwin dist builds" Apparently ThinLTO on x64 mac can regress some of the ICEs' output. This reverts rust-lang#103647 to allow for investigation, and fixes rust-lang#105637 in the meantime.
Revert "enable ThinLTO for rustc on x86_64-apple-darwin dist builds" Apparently ThinLTO on x64 mac can regress some of the ICEs' output. This reverts rust-lang#103647 to allow for investigation, and helps with rust-lang#105637 in the meantime.
Enable ThinLTO for rustc on `x86_64-apple-darwin` Local measurements seemed to show an improvement on a couple benchmarks, so I'd like to test real CI builds, and see if the builder doesn't timeout with the expected slight increase in build times. Let's start with x64 rustc ThinLTO, and then figure out the file structure to configure LLVM ThinLTO. Maybe we'll then try `aarch64` builds since that also looked good locally.
Pkgsrc changes: * Adjust patches and cargo checksums to new versions. Upstream changes: Version 1.68.0 (2023-03-09) =========================== Language -------- - [Stabilize default_alloc_error_handler] (rust-lang/rust#102318) This allows usage of `alloc` on stable without requiring the definition of a handler for allocation failure. Defining custom handlers is still unstable. - [Stabilize `efiapi` calling convention.] (rust-lang/rust#105795) - [Remove implicit promotion for types with drop glue] (rust-lang/rust#105085) Compiler -------- - [Change `bindings_with_variant_name` to deny-by-default] (rust-lang/rust#104154) - [Allow .. to be parsed as let initializer] (rust-lang/rust#105701) - [Add `armv7-sony-vita-newlibeabihf` as a tier 3 target] (rust-lang/rust#105712) - [Always check alignment during compile-time const evaluation] (rust-lang/rust#104616) - [Disable "split dwarf inlining" by default.] (rust-lang/rust#106709) - [Add vendor to Fuchsia's target triple] (rust-lang/rust#106429) - [Enable sanitizers for s390x-linux] (rust-lang/rust#107127) Libraries --------- - [Loosen the bound on the Debug implementation of Weak.] (rust-lang/rust#90291) - [Make `std::task::Context` !Send and !Sync] (rust-lang/rust#95985) - [PhantomData layout guarantees] (rust-lang/rust#104081) - [Don't derive Debug for `OnceWith` & `RepeatWith`] (rust-lang/rust#104163) - [Implement DerefMut for PathBuf] (rust-lang/rust#105018) - [Add O(1) `Vec -> VecDeque` conversion guarantee] (rust-lang/rust#105128) - [Leak amplification for peek_mut() to ensure BinaryHeap's invariant is always met] (rust-lang/rust#105851) Stabilized APIs --------------- - [`{core,std}::pin::pin!`] (https://doc.rust-lang.org/stable/std/pin/macro.pin.html) - [`impl From<bool> for {f32,f64}`] (https://doc.rust-lang.org/stable/std/primitive.f32.html#impl-From%3Cbool%3E-for-f32) - [`std::path::MAIN_SEPARATOR_STR`] (https://doc.rust-lang.org/stable/std/path/constant.MAIN_SEPARATOR_STR.html) - [`impl DerefMut for PathBuf`] (https://doc.rust-lang.org/stable/std/path/struct.PathBuf.html#impl-DerefMut-for-PathBuf) These APIs are now stable in const contexts: - [`VecDeque::new`] (https://doc.rust-lang.org/stable/std/collections/struct.VecDeque.html#method.new) Cargo ----- - [Stabilize sparse registry support for crates.io] (rust-lang/cargo#11224) - [`cargo build --verbose` tells you more about why it recompiles.] (rust-lang/cargo#11407) - [Show progress of crates.io index update even `net.git-fetch-with-cli` option enabled] (rust-lang/cargo#11579) Misc ---- Compatibility Notes ------------------- - [Add `SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` to future-incompat report] (rust-lang/rust#103418) - [Only specify `--target` by default for `-Zgcc-ld=lld` on wasm] (rust-lang/rust#101792) - [Bump `IMPLIED_BOUNDS_ENTAILMENT` to Deny + ReportNow] (rust-lang/rust#106465) - [`std::task::Context` no longer implements Send and Sync] (rust-lang/rust#95985) nternal Changes ---------------- These changes do not affect any public interfaces of Rust, but they represent significant improvements to the performance or internals of rustc and related tools. - [Encode spans relative to the enclosing item] (rust-lang/rust#84762) - [Don't normalize in AstConv] (rust-lang/rust#101947) - [Find the right lower bound region in the scenario of partial order relations] (rust-lang/rust#104765) - [Fix impl block in const expr] (rust-lang/rust#104889) - [Check ADT fields for copy implementations considering regions] (rust-lang/rust#105102) - [rustdoc: simplify JS search routine by not messing with lev distance] (rust-lang/rust#105796) - [Enable ThinLTO for rustc on `x86_64-pc-windows-msvc`] (rust-lang/rust#103591) - [Enable ThinLTO for rustc on `x86_64-apple-darwin`] (rust-lang/rust#103647)
Pkgsrc changes: * Adjust patches (add & remove) and cargo checksums to new versions. * It's conceivable that the workaround for LLVM based NetBSD works even less in this version (ref. PKGSRC_HAVE_LIBCPP not having a corresponding patch anymore). Upstream changes: Version 1.68.2 (2023-03-28) =========================== - [Update the GitHub RSA host key bundled within Cargo] (rust-lang/cargo#11883). The key was [rotated by GitHub] (https://github.blog/2023-03-23-we-updated-our-rsa-ssh-host-key/) on 2023-03-24 after the old one leaked. - [Mark the old GitHub RSA host key as revoked] (rust-lang/cargo#11889). This will prevent Cargo from accepting the leaked key even when trusted by the system. - [Add support for `@revoked` and a better error message for `@cert-authority` in Cargo's SSH host key verification] (rust-lang/cargo#11635) Version 1.68.1 (2023-03-23) =========================== - [Fix miscompilation in produced Windows MSVC artifacts] (rust-lang/rust#109094) This was introduced by enabling ThinLTO for the distributed rustc which led to miscompilations in the resulting binary. Currently this is believed to be limited to the -Zdylib-lto flag used for rustc compilation, rather than a general bug in ThinLTO, so only rustc artifacts should be affected. - [Fix --enable-local-rust builds] (rust-lang/rust#109111) - [Treat `$prefix-clang` as `clang` in linker detection code] (rust-lang/rust#109156) - [Fix panic in compiler code] (rust-lang/rust#108162) Version 1.68.0 (2023-03-09) =========================== Language -------- - [Stabilize default_alloc_error_handler] (rust-lang/rust#102318) This allows usage of `alloc` on stable without requiring the definition of a handler for allocation failure. Defining custom handlers is still unstable. - [Stabilize `efiapi` calling convention.] (rust-lang/rust#105795) - [Remove implicit promotion for types with drop glue] (rust-lang/rust#105085) Compiler -------- - [Change `bindings_with_variant_name` to deny-by-default] (rust-lang/rust#104154) - [Allow .. to be parsed as let initializer] (rust-lang/rust#105701) - [Add `armv7-sony-vita-newlibeabihf` as a tier 3 target] (rust-lang/rust#105712) - [Always check alignment during compile-time const evaluation] (rust-lang/rust#104616) - [Disable "split dwarf inlining" by default.] (rust-lang/rust#106709) - [Add vendor to Fuchsia's target triple] (rust-lang/rust#106429) - [Enable sanitizers for s390x-linux] (rust-lang/rust#107127) Libraries --------- - [Loosen the bound on the Debug implementation of Weak.] (rust-lang/rust#90291) - [Make `std::task::Context` !Send and !Sync] (rust-lang/rust#95985) - [PhantomData layout guarantees] (rust-lang/rust#104081) - [Don't derive Debug for `OnceWith` & `RepeatWith`] (rust-lang/rust#104163) - [Implement DerefMut for PathBuf] (rust-lang/rust#105018) - [Add O(1) `Vec -> VecDeque` conversion guarantee] (rust-lang/rust#105128) - [Leak amplification for peek_mut() to ensure BinaryHeap's invariant is always met] (rust-lang/rust#105851) Stabilized APIs --------------- - [`{core,std}::pin::pin!`] (https://doc.rust-lang.org/stable/std/pin/macro.pin.html) - [`impl From<bool> for {f32,f64}`] (https://doc.rust-lang.org/stable/std/primitive.f32.html#impl-From%3Cbool%3E-for-f32) - [`std::path::MAIN_SEPARATOR_STR`] (https://doc.rust-lang.org/stable/std/path/constant.MAIN_SEPARATOR_STR.html) - [`impl DerefMut for PathBuf`] (https://doc.rust-lang.org/stable/std/path/struct.PathBuf.html#impl-DerefMut-for-PathBuf) These APIs are now stable in const contexts: - [`VecDeque::new`] (https://doc.rust-lang.org/stable/std/collections/struct.VecDeque.html#method.new) Cargo ----- - [Stabilize sparse registry support for crates.io] (rust-lang/cargo#11224) - [`cargo build --verbose` tells you more about why it recompiles.] (rust-lang/cargo#11407) - [Show progress of crates.io index update even `net.git-fetch-with-cli` option enabled] (rust-lang/cargo#11579) Misc ---- Compatibility Notes ------------------- - [Add `SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` to future-incompat report] (rust-lang/rust#103418) - [Only specify `--target` by default for `-Zgcc-ld=lld` on wasm] (rust-lang/rust#101792) - [Bump `IMPLIED_BOUNDS_ENTAILMENT` to Deny + ReportNow] (rust-lang/rust#106465) - [`std::task::Context` no longer implements Send and Sync] (rust-lang/rust#95985) nternal Changes ---------------- These changes do not affect any public interfaces of Rust, but they represent significant improvements to the performance or internals of rustc and related tools. - [Encode spans relative to the enclosing item] (rust-lang/rust#84762) - [Don't normalize in AstConv] (rust-lang/rust#101947) - [Find the right lower bound region in the scenario of partial order relations] (rust-lang/rust#104765) - [Fix impl block in const expr] (rust-lang/rust#104889) - [Check ADT fields for copy implementations considering regions] (rust-lang/rust#105102) - [rustdoc: simplify JS search routine by not messing with lev distance] (rust-lang/rust#105796) - [Enable ThinLTO for rustc on `x86_64-pc-windows-msvc`] (rust-lang/rust#103591) - [Enable ThinLTO for rustc on `x86_64-apple-darwin`] (rust-lang/rust#103647) Version 1.67.0 (2023-01-26) ========================== Language -------- - [Make `Sized` predicates coinductive, allowing cycles.] (rust-lang/rust#100386) - [`#[must_use]` annotations on `async fn` also affect the `Future::Output`.] (rust-lang/rust#100633) - [Elaborate supertrait obligations when deducing closure signatures.] (rust-lang/rust#101834) - [Invalid literals are no longer an error under `cfg(FALSE)`.] (rust-lang/rust#102944) - [Unreserve braced enum variants in value namespace.] (rust-lang/rust#103578) Compiler -------- - [Enable varargs support for calling conventions other than `C` or `cdecl`.] (rust-lang/rust#97971) - [Add new MIR constant propagation based on dataflow analysis.] (rust-lang/rust#101168) - [Optimize field ordering by grouping m\*2^n-sized fields with equivalently aligned ones.] (rust-lang/rust#102750) - [Stabilize native library modifier `verbatim`.] (rust-lang/rust#104360) Added and removed targets: - [Add a tier 3 target for PowerPC on AIX] (rust-lang/rust#102293), `powerpc64-ibm-aix`. - [Add a tier 3 target for the Sony PlayStation 1] (rust-lang/rust#102689), `mipsel-sony-psx`. - [Add tier 3 `no_std` targets for the QNX Neutrino RTOS] (rust-lang/rust#102701), `aarch64-unknown-nto-qnx710` and `x86_64-pc-nto-qnx710`. - [Remove tier 3 `linuxkernel` targets] (rust-lang/rust#104015) (not used by the actual kernel). Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries --------- - [Merge `crossbeam-channel` into `std::sync::mpsc`.] (rust-lang/rust#93563) - [Fix inconsistent rounding of 0.5 when formatted to 0 decimal places.] (rust-lang/rust#102935) - [Derive `Eq` and `Hash` for `ControlFlow`.] (rust-lang/rust#103084) - [Don't build `compiler_builtins` with `-C panic=abort`.] (rust-lang/rust#103786) Stabilized APIs --------------- - [`{integer}::checked_ilog`] (https://doc.rust-lang.org/stable/std/primitive.i32.html#method.checked_ilog) - [`{integer}::checked_ilog2`] (https://doc.rust-lang.org/stable/std/primitive.i32.html#method.checked_ilog2) - [`{integer}::checked_ilog10`] (https://doc.rust-lang.org/stable/std/primitive.i32.html#method.checked_ilog10) - [`{integer}::ilog`] (https://doc.rust-lang.org/stable/std/primitive.i32.html#method.ilog) - [`{integer}::ilog2`] (https://doc.rust-lang.org/stable/std/primitive.i32.html#method.ilog2) - [`{integer}::ilog10`] (https://doc.rust-lang.org/stable/std/primitive.i32.html#method.ilog10) - [`NonZeroU*::ilog2`] (https://doc.rust-lang.org/stable/std/num/struct.NonZeroU32.html#method.ilog2) - [`NonZeroU*::ilog10`] (https://doc.rust-lang.org/stable/std/num/struct.NonZeroU32.html#method.ilog10) - [`NonZero*::BITS`] (https://doc.rust-lang.org/stable/std/num/struct.NonZeroU32.html#associatedconstant.BITS) These APIs are now stable in const contexts: - [`char::from_u32`] (https://doc.rust-lang.org/stable/std/primitive.char.html#method.from_u32) - [`char::from_digit`] (https://doc.rust-lang.org/stable/std/primitive.char.html#method.from_digit) - [`char::to_digit`] (https://doc.rust-lang.org/stable/std/primitive.char.html#method.to_digit) - [`core::char::from_u32`] (https://doc.rust-lang.org/stable/core/char/fn.from_u32.html) - [`core::char::from_digit`] (https://doc.rust-lang.org/stable/core/char/fn.from_digit.html) Compatibility Notes ------------------- - [The layout of `repr(Rust)` types now groups m\*2^n-sized fields with equivalently aligned ones.] (rust-lang/rust#102750) This is intended to be an optimization, but it is also known to increase type sizes in a few cases for the placement of enum tags. As a reminder, the layout of `repr(Rust)` types is an implementation detail, subject to change. - [0.5 now rounds to 0 when formatted to 0 decimal places.] (rust-lang/rust#102935) This makes it consistent with the rest of floating point formatting that rounds ties toward even digits. - [Chains of `&&` and `||` will now drop temporaries from their sub-expressions in evaluation order, left-to-right.] (rust-lang/rust#103293) Previously, it was "twisted" such that the _first_ expression dropped its temporaries _last_, after all of the other expressions dropped in order. - [Underscore suffixes on string literals are now a hard error.] (rust-lang/rust#103914) This has been a future-compatibility warning since 1.20.0. - [Stop passing `-export-dynamic` to `wasm-ld`.] (rust-lang/rust#105405) - [`main` is now mangled as `__main_void` on `wasm32-wasi`.] (rust-lang/rust#105468) - [Cargo now emits an error if there are multiple registries in the configuration with the same index URL.] (rust-lang/cargo#10592) Internal Changes ---------------- These changes do not affect any public interfaces of Rust, but they represent significant improvements to the performance or internals of rustc and related tools. - [Rewrite LLVM's archive writer in Rust.] (rust-lang/rust#97485)
Local measurements seemed to show an improvement on a couple benchmarks, so I'd like to test real CI builds, and see if the builder doesn't timeout with the expected slight increase in build times.
Let's start with x64 rustc ThinLTO, and then figure out the file structure to configure LLVM ThinLTO. Maybe we'll then try
aarch64
builds since that also looked good locally.