From e9a664867ffdcdbf3309218e6b904a4b4597d228 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jakub=20Ber=C3=A1nek?=
Date: Mon, 4 Sep 2023 21:31:00 +0200
Subject: [PATCH] Add section about building an optimized version of `rustc`

---
 src/SUMMARY.md                  |   1 +
 src/building/optimized-build.md | 124 ++++++++++++++++++++++++++++++++
 2 files changed, 125 insertions(+)
 create mode 100644 src/building/optimized-build.md

diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index c8481567f7..101fcf880c 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -14,6 +14,7 @@
     - [Building Documentation](./building/compiler-documenting.md)
     - [Rustdoc overview](./rustdoc.md)
     - [Adding a new target](./building/new-target.md)
+    - [Optimized build](./building/optimized-build.md)
 - [Testing the compiler](./tests/intro.md)
     - [Running tests](./tests/running.md)
         - [Testing with Docker](./tests/docker.md)
diff --git a/src/building/optimized-build.md b/src/building/optimized-build.md
new file mode 100644
index 0000000000..81ad725ece
--- /dev/null
+++ b/src/building/optimized-build.md
@@ -0,0 +1,124 @@
+# Optimized build of the compiler
+
+
+
+There are multiple additional build configuration options and techniques that can be used to compile a build of `rustc`
+that is as optimized as possible (for example when building `rustc` for a Linux distribution). The status of these
+configuration options for various Rust targets is tracked [here]. This page describes how you can use these approaches
+when building `rustc` yourself.
+
+[here]: https://github.com/rust-lang/rust/issues/103595
+
+## Link-time optimization
+
+Link-time optimization is a powerful compiler technique that can increase program performance. To enable (Thin-)LTO when
+building `rustc`, set the `rust.lto` config option to `"thin"` in `config.toml`:
+
+```toml
+[rust]
+lto = "thin"
+```
+
+> Note that LTO for `rustc` is currently supported and tested only for the `x86_64-unknown-linux-gnu` target. Other
+> targets *may* work, but no guarantees are provided. Notably, LTO-optimized `rustc` currently produces
+> [miscompilations] on Windows.
+
+[miscompilations]: https://github.com/rust-lang/rust/issues/109114
+
+Enabling LTO on Linux has [produced] speed-ups of up to 10%.
+
+[produced]: https://github.com/rust-lang/rust/pull/101403#issuecomment-1288190019
+
+## Memory allocator
+
+Using a different memory allocator for `rustc` can provide significant performance benefits. If you want to enable
+the `jemalloc` allocator, you can set the `rust.jemalloc` option to `true` in `config.toml`:
+
+```toml
+[rust]
+jemalloc = true
+```
+
+> Note that this option is currently only supported for Linux and macOS targets.
+
+## Codegen units
+
+Reducing the number of codegen units per `rustc` crate can produce a faster build of the compiler. You can modify the
+number of codegen units for `rustc` and `libstd` in `config.toml` with the following options:
+
+```toml
+[rust]
+codegen-units = 1
+codegen-units-std = 1
+```
+
+## Instruction set
+
+By default, `rustc` is compiled for a generic (and conservative) instruction set architecture (depending on the selected
+target), to make it support as many CPUs as possible. If you want to compile `rustc` for a specific instruction
+set architecture, you can set the `target_cpu` compiler option in `RUSTFLAGS`:
+
+```bash
+$ RUSTFLAGS="-C target_cpu=x86-64-v3" ./x.py build ...
+```
+
+If you also want to compile LLVM for a specific instruction set, you can set `llvm` flags in `config.toml`:
+
+```toml
+[llvm]
+cxxflags = "-march=x86-64-v3"
+cflags = "-march=x86-64-v3"
+```
+
+## Profile-guided optimization
+
+Applying profile-guided optimizations (or more generally, feedback-directed optimizations) can produce a large increase
+in `rustc` performance, by up to 25%. However, these techniques are not simply enabled by a configuration option,
+but rather they require a complex build workflow that compiles `rustc` multiple times and profiles it on selected
+benchmarks.
+
+There is a tool called `opt-dist` that is used to optimize `rustc` with [PGO] (profile-guided optimizations) and [BOLT]
+(a post-link binary optimizer) for builds distributed to end users. You can examine the tool, which is located
+in `src/tools/opt-dist`, and build a custom PGO build workflow based on it, or try to use it directly. Note that the
+tool is currently quite hardcoded to the way we use it in Rust's continuous integration workflows, and it might require
+some custom changes to make it work in a different environment.
+
+[PGO]: https://doc.rust-lang.org/rustc/profile-guided-optimization.html
+
+[BOLT]: https://github.com/llvm/llvm-project/blob/main/bolt/README.md
+
+To use the tool, you will need to provide some external dependencies:
+
+- A Python 3 interpreter (for executing `x.py`).
+- Compiled LLVM toolchain, with the `llvm-profdata` binary. Optionally, if you want to use BOLT, the `llvm-bolt` and
+  `merge-fdata` binaries have to be available in the toolchain.
+- Downloaded [Rust benchmark suite]. (You can also let the tool download it itself, if you implement a custom
+  environment, see below).
+
+These dependencies are provided to `opt-dist` by an implementation of the [`Environment`] trait. You can either
+implement the trait for your custom environment, by providing paths to these dependencies in its methods, or reuse one
+of the existing implementations (currently, there is an implementation for Linux and Windows). If you want your
+environment to support BOLT, return `true` from the `supports_bolt` method.
+
+Here is an example of how `opt-dist` can be used with the default Linux environment (it assumes that you execute the
+following commands on a Linux system):
+
+1. Build the tool with the following command:
+    ```bash
+    $ python3 x.py build tools/opt-dist
+    ```
+2. Run the tool with the `PGO_HOST` environment variable set to the 64-bit Linux target:
+    ```bash
+    $ PGO_HOST=x86_64-unknown-linux-gnu ./build/host/stage0-tools-bin/opt-dist
+    ```
+    Note that the default Linux environment expects several hardcoded paths to exist:
+    - `/checkout` should contain a checkout of the Rust compiler repository that will be compiled.
+    - `/rustroot` should contain the compiled LLVM toolchain (containing BOLT).
+    - A Python 3 interpreter should be available under the `python3` binary.
+    - `/tmp/rustc-perf` should contain a downloaded checkout of the Rust benchmark suite.
+
+You can modify `LinuxEnvironment` (or implement your own) to override these paths.
+
+[`Environment`]: https://github.com/rust-lang/rust/blob/65e468f9c259749c210b1ae8972bfe14781f72f1/src/tools/opt-dist/src/environment/mod.rs#L8-L7
+
+[Rust benchmark suite]: https://github.com/rust-lang/rustc-perf
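
For reference, the `config.toml` options introduced by the page in this patch can be combined into a single file. The fragment below is a minimal sketch, not a recommended configuration. It assumes an `x86_64-unknown-linux-gnu` host, which is the only target the page describes as supported and tested for LTO and one where `jemalloc` is supported. It also assumes that every machine running the resulting compiler supports the `x86-64-v3` level.

```toml
# Sketch only: combines the options from the sections of the new page.
# Assumes an x86_64-unknown-linux-gnu host.

[rust]
# Thin LTO (section "Link-time optimization").
lto = "thin"
# jemalloc allocator, Linux and macOS only (section "Memory allocator").
jemalloc = true
# A single codegen unit for rustc and libstd (section "Codegen units").
codegen-units = 1
codegen-units-std = 1

[llvm]
# Compile LLVM for a specific instruction set (section "Instruction set").
# Assumption: x86-64-v3 is supported by all CPUs that will run the build.
cxxflags = "-march=x86-64-v3"
cflags = "-march=x86-64-v3"
```

The `target_cpu` setting for `rustc` itself is not part of this file. As the "Instruction set" section shows, it is passed through `RUSTFLAGS` when invoking `x.py`.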