-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use consistent prefix map when building deterministic system libraries #23222
Conversation
When `deterministic_paths` is set, we are currently using `-ffile-prefix-map` to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path `/emsdk/emscripten`, and in the case of relative paths, all path relative to the emscripten directory is removed, so `../../system/lib/somefile.c` becomes `system/lib/somefiles.c`. /~https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L472-L477 /~https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L495-L501 So this does not make relative paths and absolute paths the same, which can lead to different builds depending on whether the command line uses absolute paths vs. relative ones. Currently we use relative paths when `EMCC_BATCH_BUILD` is set. And Ninja builds cannot use relative paths. This is also what was suggested in emscripten-core#23195 (comment) while discussins `__FILE__` problem in #23915. This does not change any code size tests because no code size tests happens to contain `__FILE__` at the moment. (One will be added in emscripten-core#22994)
Looks like the test_dwarf_system_lib failure is real. Can we come up with a better PR title? Something like "Use consistent prefix map when building deterministic system libraries"? |
tools/system_libs.py
Outdated
@@ -42,6 +42,10 @@ | |||
# link time. | |||
USE_NINJA = int(os.environ.get('EMCC_USE_NINJA', '0')) | |||
|
|||
# A fake emscripten path to use in __FILE__ macro and debug info to produce | |||
# reproducible builds across platforms. | |||
FAKE_EMSCRIPTEN_PATH = '/emsdk/emscripten' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about DETERMINISITIC_PREFIX
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm % comments and tests passing
Fixed.
Done. |
'-fdebug-compilation-dir=/emsdk/emscripten'] | ||
cflags += [f'-ffile-prefix-map={relative_source_dir}={DETERMINISITIC_PREFIX}'] | ||
cflags += [f'-ffile-prefix-map={source_dir}={DETERMINISITIC_PREFIX}', | ||
f'-fdebug-compilation-dir={DETERMINISITIC_PREFIX}'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if these two chunks could be combined into self.get_cflags()
rather than duplicated? Can be part of the followup of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #23225.
This factors duplicate routines to add deterministic prefix paths out to `Library.get_cflags()`. Suggested in emscripten-core#23222 (comment).
This factors duplicate routines to add deterministic prefix paths out to `Library.get_cflags()`. Suggested in #23222 (comment).
This updates libcxx and libcxxabi to LLVM 19.1.4: /~https://github.com/llvm/llvm-project/releases/tag/llvmorg-19.1.4 The initial update was done using `update_libcxx.py` and `update_libcxxabi.py`, and subsequent fixes were made in indidual commits. The commit history here is kind of messy because of CI testing so not all individual commits are noteworthy. Additional changes: - Build libcxx and libcxxabi with C++23: 8b0bfdf /~https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxx/src/expected.cpp was added in llvm/llvm-project#87390 and this file assumes C++23 to be compiled. Apparently libc++ sources are always built with C++23 so they don't guard things against it in `src/`: llvm/llvm-project#87390 (comment) This commit also builds libc++abi with C++23 because it doesn't seem there's any downside to it. - Exclude newly added `compiler_rt_shims.cpp`: 5bbcbf0 We have excluded files in /~https://github.com/emscripten-core/emscripten/tree/main/system/lib/libcxx/src/support/win32. This is a new file added in this directory in llvm/llvm-project#83575. - Disable time zone support: a5f2cbe We disabled C++20 time zone support in LLVM 18 update (#21638): df9af64 The list of source files related to time zone support has changed in llvm/llvm-project#74928, so this commit reflects it. - Re-add + update `__assertion_handler` from `default_assertion_handler.in`: 41f8037 This file was added as a part of LLVM 18 update (#21638) in 8d51927 and mistakenly deleted when I ran `update_libcxx.py`. This file was copied from /~https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxx/vendor/llvm/default_assertion_handler.in, so this also updates the file with the newest `default_assertion_handler.in`. - `_LIBCPP_PSTL_CPU_BACKEND_SERIAL` -> `_LIBCPP_PSTL_BACKEND_SERIAL`: 4b969c3 The name changed in this update, so reflecting it on our `__config_site`. - Directly include `pthread.h` from `emscripten/val.h`: a5a76c3 Due to file rearrangements happened, this was necessary. --- Other things to note: - `std::basic_string<unsigned_char>` is not supported anymore The support for `std::basic_string<unsigned_char>`, which was being used by embind, is removed in this version.#23070 removes the uses from embind. - libcxxabi uses `__FILE__` in more places llvm/llvm-project#80689 started to use `_LIBCXXABI_ASSERT`, which [uses](/~https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxxabi/src/abort_message.h#L22) `__FILE__`, in `private_typeinfo.cpp`. `__FILE__` macro produces different paths depending on the local directory names and how the file is given to clang in the command line, and this file is included in the result of one of our code size tests, `other.test_minimal_runtime_code_size_hello_embind`, which caused the result of the test to vary depending on the CI bots and how the library is built (e.g., whether by embuilder, ninja, or neither). Even though this was brought to surface during this LLVM 19 update, `__FILE__` macro could be a problem for producing reproducible builds anyway. We discussed this problem in #23195, and the fixes landed in #23222, #23225, and #23256.
When
deterministic_paths
is set, we are currently using-ffile-prefix-map
to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path/emsdk/emscripten
, and in the case of relative paths, all path relative to the emscripten directory is removed, so../../system/lib/somefile.c
becomessystem/lib/somefiles.c
.emscripten/tools/system_libs.py
Lines 472 to 477 in f66b5d7
emscripten/tools/system_libs.py
Lines 495 to 501 in f66b5d7
So this does not make relative paths and absolute paths the same, which can lead to different builds depending on whether the command line uses absolute paths vs. relative ones. Currently we use relative paths when
EMCC_BATCH_BUILD
is set. And Ninja builds cannot use relative paths.This is also what was suggested in #23195 (comment) while discussing
__FILE__
problem in #23195.This does not change any code size tests because no code size tests happens to contain
__FILE__
at the moment. (One will be added in #22994)