Build with frame pointers for improved profiling #10224
Comments
This isn't the case. I looked at the assembly function prologue, which doesn't maintain frame pointers. This is confirmed by the rustc target specs, which default to […]. Apple aarch64 does enable frame pointers, since this is required by Apple debug tooling.
tikv-jemallocator builds jemalloc with frame pointers.
Reopening in an attempt to fix pprof-rs seg faults.
## Problem

Frame pointers are typically disabled by default (depending on CPU architecture) to improve performance. This frees up a CPU register and avoids a couple of instructions per function call. However, it makes stack unwinding much less efficient, since it has to use DWARF debug information instead, and gives worse results with e.g. `perf` and eBPF profiles. The `backtrace` implementation of `libunwind` is also suspected to cause seg faults.

The performance benefit of frame pointer omission doesn't appear to matter that much on modern 64-bit CPU architectures (which have plenty of registers and optimized instruction execution), and benchmarks did not show measurable overhead.

The Rust standard library and jemalloc already enable frame pointers by default.

For more information, see https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html.

Resolves #10224.
Touches #10225.

## Summary of changes

Enable frame pointers in all builds, and use frame pointers for pprof-rs stack sampling.
Release binaries are currently built without frame pointers. This frees up a register for the compiler and avoids a couple of instructions per function call, which can improve performance (typically <1%). However, stack unwinding and profiling then have to use DWARF information to generate backtraces, which is far more expensive and can cause difficulties, e.g. for `perf` and eBPF profilers.

We're considering continuous profiling, and jemalloc heap profiling already probabilistically takes stack traces during allocations. These stack traces will be much cheaper with frame pointers enabled. This might save more CPU than we lose with the dedicated frame pointer register, and allow us to profile at higher frequency.
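To illustrate why these stack traces get cheaper, here is a minimal sketch of frame-pointer unwinding as a toy model in Rust. It is not how `perf`, pprof-rs, or jemalloc actually implement it: the "stack" is a plain slice of words, a frame pointer is an index into that slice, and the names `walk_frames` and `demo_trace` are hypothetical. The assumed frame layout (saved caller frame pointer followed by the return address) mirrors the usual x86-64 and aarch64 convention.

```rust
/// Walks a simulated frame-pointer chain and collects return addresses.
///
/// Assumed layout at each frame pointer `fp` (an index into `stack`):
///   stack[fp]     = the caller's frame pointer (0 terminates the chain)
///   stack[fp + 1] = the return address into the caller
///
/// This is a toy model: a real unwinder reads raw memory and must also
/// validate that each address is mapped before dereferencing it.
fn walk_frames(stack: &[u64], mut fp: usize, max_depth: usize) -> Vec<u64> {
    let mut return_addrs = Vec::new();
    while fp != 0 && return_addrs.len() < max_depth {
        // Stop if the frame pointer points outside the modeled stack.
        let (Some(&saved_fp), Some(&ret)) = (stack.get(fp), stack.get(fp + 1)) else {
            break;
        };
        return_addrs.push(ret);

        let next = saved_fp as usize;
        // Callers sit at higher addresses, so a chain that does not
        // advance is treated as corrupt and the walk stops.
        if next != 0 && next <= fp {
            break;
        }
        fp = next;
    }
    return_addrs
}

/// Builds a three-frame example stack and walks it from the leaf frame.
fn demo_trace() -> Vec<u64> {
    // Index 0 is unused so that 0 can serve as the end-of-chain marker.
    let stack: [u64; 9] = [
        0,         // 0: unused
        4, 0x1111, // 1: leaf frame (saved fp -> 4, return address)
        0xdead,    // 3: a local variable
        7, 0x2222, // 4: middle frame (saved fp -> 7, return address)
        0xbeef,    // 6: a local variable
        0, 0x3333, // 7: outermost frame (saved fp = 0 ends the chain)
    ];
    walk_frames(&stack, 1, 64)
}
```

Each step is two loads and a bounds check, with no lookup in debug information. A DWARF-based unwinder instead has to find and interpret unwind tables for every frame, which is where the extra cost comes from.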
The Rust stdlib recently enabled frame pointers by default for this reason. It's also possible that frame pointers are already enabled by default on aarch64 CPUs (used for Pageservers), since this architecture uses a dedicated frame pointer register.
Related reading: