forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Encode WebAssembly specific locations in DBG_VALUEs and DW_AT_frame_base #2
Merged
alexcrichton
merged 2 commits into
rust-lang:rustc/8.0-2019-01-16
from
yurydelendik:rustc/frame-pointer
Jan 17, 2019
Merged
[WIP] Encode WebAssembly specific locations in DBG_VALUEs and DW_AT_frame_base #2
alexcrichton
merged 2 commits into
rust-lang:rustc/8.0-2019-01-16
from
yurydelendik:rustc/frame-pointer
Jan 17, 2019
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…lyExplicitLocals pass Extends DWARF expression larguage to express locals/globals locations. (via target-index operands) The WebAssemblyExplicitLocals can replace virtual registers to target-index() operand type at the time when WebAssembly backend introduces local.{get,set,tee} instead of corresponding virtual registers. Reviewers: aprantl, dschuff Differential Revision: https://reviews.llvm.org/D52634
I've added this to rust-lang/rust#57675 |
Merged
alexcrichton
pushed a commit
that referenced
this pull request
Jul 12, 2019
Introduction ============ This patch added intial support for bpf program compile once and run everywhere (CO-RE). The main motivation is for bpf program which depends on kernel headers which may vary between different kernel versions. The initial discussion can be found at https://lwn.net/Articles/773198/. Currently, bpf program accesses kernel internal data structure through bpf_probe_read() helper. The idea is to capture the kernel data structure to be accessed through bpf_probe_read() and relocate them on different kernel versions. On each host, right before bpf program load, the bpfloader will look at the types of the native linux through vmlinux BTF, calculates proper access offset and patch the instruction. To accommodate this, three intrinsic functions preserve_{array,union,struct}_access_index are introduced which in clang will preserve the base pointer, struct/union/array access_index and struct/union debuginfo type information. Later, bpf IR pass can reconstruct the whole gep access chains without looking at gep itself. This patch did the following: . An IR pass is added to convert preserve_*_access_index to global variable who name encodes the getelementptr access pattern. The global variable has metadata attached to describe the corresponding struct/union debuginfo type. . An SimplifyPatchable MachineInstruction pass is added to remove unnecessary loads. . The BTF output pass is enhanced to generate relocation records located in .BTF.ext section. Typical CO-RE also needs support of global variables which can be assigned to different values to different hosts. For example, kernel version can be used to guard different versions of codes. This patch added the support for patchable externals as well. Example ======= The following is an example. struct pt_regs { long arg1; long arg2; }; struct sk_buff { int i; struct net_device *dev; }; #define _(x) (__builtin_preserve_access_index(x)) static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) = (void *) 4; extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version; int bpf_prog(struct pt_regs *ctx) { struct net_device *dev = 0; // ctx->arg* does not need bpf_probe_read if (__kernel_version >= 41608) bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg1)->dev)); else bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg2)->dev)); return dev != 0; } In the above, we want to translate the third argument of bpf_probe_read() as relocations. -bash-4.4$ clang -target bpf -O2 -g -S trace.c The compiler will generate two new subsections in .BTF.ext, OffsetReloc and ExternReloc. OffsetReloc is to record the structure member offset operations, and ExternalReloc is to record the external globals where only u8, u16, u32 and u64 are supported. BPFOffsetReloc Size struct SecLOffsetReloc for ELF section #1 A number of struct BPFOffsetReloc for ELF section #1 struct SecOffsetReloc for ELF section #2 A number of struct BPFOffsetReloc for ELF section #2 ... BPFExternReloc Size struct SecExternReloc for ELF section #1 A number of struct BPFExternReloc for ELF section #1 struct SecExternReloc for ELF section #2 A number of struct BPFExternReloc for ELF section #2 struct BPFOffsetReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t TypeID; ///< TypeID for the relocation uint32_t OffsetNameOff; ///< The string to traverse types }; struct BPFExternReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t ExternNameOff; ///< The string for external variable }; Note that only externs with attribute section ".BPF.patchable_externs" are considered for Extern Reloc which will be patched by bpf loader right before the load. For the above test case, two offset records and one extern record will be generated: OffsetReloc records: .long .Ltmp12 # Insn Offset .long 7 # TypeId .long 242 # Type Decode String .long .Ltmp18 # Insn Offset .long 7 # TypeId .long 242 # Type Decode String ExternReloc record: .long .Ltmp5 # Insn Offset .long 165 # External Variable In string table: .ascii "0:1" # string offset=242 .ascii "__kernel_version" # string offset=165 The default member offset can be calculated as the 2nd member offset (0 representing the 1st member) of struct "sk_buff". The asm code: .Ltmp5: .Ltmp6: r2 = 0 r3 = 41608 .Ltmp7: .Ltmp8: .loc 1 18 9 is_stmt 0 # t.c:18:9 .Ltmp9: if r3 > r2 goto LBB0_2 .Ltmp10: .Ltmp11: .loc 1 0 9 # t.c:0:9 .Ltmp12: r2 = 8 .Ltmp13: .loc 1 19 66 is_stmt 1 # t.c:19:66 .Ltmp14: .Ltmp15: r3 = *(u64 *)(r1 + 0) goto LBB0_3 .Ltmp16: .Ltmp17: LBB0_2: .loc 1 0 66 is_stmt 0 # t.c:0:66 .Ltmp18: r2 = 8 .loc 1 21 66 is_stmt 1 # t.c:21:66 .Ltmp19: r3 = *(u64 *)(r1 + 8) .Ltmp20: .Ltmp21: LBB0_3: .loc 1 0 66 is_stmt 0 # t.c:0:66 r3 += r2 r1 = r10 .Ltmp22: .Ltmp23: .Ltmp24: r1 += -8 r2 = 8 call 4 For instruction .Ltmp12 and .Ltmp18, "r2 = 8", the number 8 is the structure offset based on the current BTF. Loader needs to adjust it if it changes on the host. For instruction .Ltmp5, "r2 = 0", the external variable got a default value 0, loader needs to supply an appropriate value for the particular host. Compiling to generate object code and disassemble: 0000000000000000 bpf_prog: 0: b7 02 00 00 00 00 00 00 r2 = 0 1: 7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2 2: b7 02 00 00 00 00 00 00 r2 = 0 3: b7 03 00 00 88 a2 00 00 r3 = 41608 4: 2d 23 03 00 00 00 00 00 if r3 > r2 goto +3 <LBB0_2> 5: b7 02 00 00 08 00 00 00 r2 = 8 6: 79 13 00 00 00 00 00 00 r3 = *(u64 *)(r1 + 0) 7: 05 00 02 00 00 00 00 00 goto +2 <LBB0_3> 0000000000000040 LBB0_2: 8: b7 02 00 00 08 00 00 00 r2 = 8 9: 79 13 08 00 00 00 00 00 r3 = *(u64 *)(r1 + 8) 0000000000000050 LBB0_3: 10: 0f 23 00 00 00 00 00 00 r3 += r2 11: bf a1 00 00 00 00 00 00 r1 = r10 12: 07 01 00 00 f8 ff ff ff r1 += -8 13: b7 02 00 00 08 00 00 00 r2 = 8 14: 85 00 00 00 04 00 00 00 call 4 Instructions #2, #5 and #8 need relocation resoutions from the loader. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61524 llvm-svn: 365503
alexcrichton
pushed a commit
to alexcrichton/llvm-project
that referenced
this pull request
Jul 19, 2019
against CXX compiler ID instead of CRT test ID. llvm-svn: 364975
alexcrichton
pushed a commit
to alexcrichton/llvm-project
that referenced
this pull request
Jul 19, 2019
If a core file has an EFI version string which includes a UUID (similar to what it returns for the kdp KDP_KERNELVERSION packet) in the LC_IDENT or LC_NOTE 'kern ver str' load command. In that case, we should try to find the binary and dSYM for the UUID listed. The dSYM may have python code which knows how to relocate the binary to the correct address in lldb's target section load list and loads other ancillary binaries. The test case is a little involved, 1. it compiles an inferior hello world apple (a.out), 2. it compiles a program which can create a corefile manually with a specific binary's UUID encoded in it, 3. it gets the UUID of the a.out binary, 4. it creates a shell script, dsym-for-uuid.sh, which will return the full path to the a.out + a.out.dSYM when called with teh correct UUID, 5. it sets the LLDB_APPLE_DSYMFORUUID_EXECUTABLE env var before creating the lldb target, to point to this dsym-for-uuid.sh, 6. runs the create-corefile binary we compiled in step rust-lang#2, 7. loads the corefile from step rust-lang#6 into lldb, 8. verifies that lldb loaded a.out by reading the LC_NOTE load command from the corefile, calling dsym-for-uuid.sh with that UUID, got back the path to a.out and loaded it. whew! <rdar://problem/47562911> llvm-svn: 366378
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Jan 18, 2020
…F optimizing part. rust-lang#2. Summary: This patch relands D71271. The problem with D71271 is that it has cyclic dependency: CodeGen->AsmPrinter->DebugInfoDWARF->CodeGen. To avoid cyclic dependency this patch puts implementation for DWARFOptimizer into separate library: lib/DWARFLinker. Thus the difference between this patch and D71271 is in that DWARFOptimizer renamed into DWARFLinker and it`s files are put into lib/DWARFLinker. Reviewers: JDevlieghere, friss, dblaikie, aprantl Reviewed By: JDevlieghere Subscribers: thegameg, merge_guards_bot, probinson, mgorny, hiraditya, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D71839
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Jan 18, 2020
…t binding This fixes a failing testcase on Fedora 30 x86_64 (regression Fedora 29->30): PASS: ./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit * frame #0: 0x00007ffff7aa6e75 libc.so.6`__GI_raise + 325 frame #1: 0x00007ffff7a91895 libc.so.6`__GI_abort + 295 frame rust-lang#2: 0x0000000000401140 a.out`func_c at main.c:12:2 frame rust-lang#3: 0x000000000040113a a.out`func_b at main.c:18:2 frame rust-lang#4: 0x0000000000401134 a.out`func_a at main.c:26:2 frame rust-lang#5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2 frame rust-lang#6: 0x00007ffff7a92f33 libc.so.6`__libc_start_main + 243 frame rust-lang#7: 0x000000000040106e a.out`_start + 46 vs. FAIL - unrecognized abort() function: ./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit * frame #0: 0x00007ffff7aa6e75 libc.so.6`.annobin_raise.c + 325 frame #1: 0x00007ffff7a91895 libc.so.6`.annobin_loadmsgcat.c_end.unlikely + 295 frame rust-lang#2: 0x0000000000401140 a.out`func_c at main.c:12:2 frame rust-lang#3: 0x000000000040113a a.out`func_b at main.c:18:2 frame rust-lang#4: 0x0000000000401134 a.out`func_a at main.c:26:2 frame rust-lang#5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2 frame rust-lang#6: 0x00007ffff7a92f33 libc.so.6`.annobin_libc_start.c + 243 frame rust-lang#7: 0x000000000040106e a.out`.annobin_init.c.hot + 46 The extra ELF symbols are there due to Annobin (I did not investigate why this problem happened specifically since F-30 and not since F-28). It is due to: Symbol table '.dynsym' contains 2361 entries: Valu e Size Type Bind Vis Name 0000000000022769 5 FUNC LOCAL DEFAULT _nl_load_domain.cold 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_abort.c.unlikely ... 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_loadmsgcat.c_end.unlikely ... 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_textdomain.c_end.unlikely 000000000002276e 548 FUNC GLOBAL DEFAULT abort 000000000002276e 548 FUNC GLOBAL DEFAULT abort@@GLIBC_2.2.5 000000000002276e 548 FUNC LOCAL DEFAULT __GI_abort 0000000000022992 0 NOTYPE LOCAL HIDDEN .annobin_abort.c_end.unlikely GDB has some more complicated preferences between overlapping and/or sharing address symbols, I have made here so far the most simple fix for this case. Differential revision: https://reviews.llvm.org/D63540
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Jan 30, 2020
Using the same strategy as c38e425. D69825 revealed (introduced?) a problem when building with ASan, and some memory leaks somewhere. More details are available in the original patch. Looks like we missed one failing tests, this patch adds the workaround to this test as well. (cherry picked from commit e174da4)
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Apr 7, 2020
Saves only 36 includes of ASTContext.h and related headers. There are two deps on ASTContext.h: - C++ method overrides iterator types (TinyPtrVector) - getting LangOptions For #1, duplicate the iterator type, which is TinyPtrVector<>::const_iterator. For rust-lang#2, add an out-of-line accessor to get the language options. Getting the ASTContext from a Decl is already an out of line method that loops over the parent DeclContexts, so if it is ever performance critical, the proper fix is to pass the context (or LangOpts) into the predicate in question. Other changes are just header fixups.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Apr 16, 2020
Summary: Previously `AtosSymbolizer` would set the PID to examine in the constructor which is called early on during sanitizer init. This can lead to incorrect behaviour in the case of a fork() because if the symbolizer is launched in the child it will be told examine the parent process rather than the child. To fix this the PID is determined just before the symbolizer is launched. A test case is included that triggers the buggy behaviour that existed prior to this patch. The test observes the PID that `atos` was called on. It also examines the symbolized stacktrace. Prior to this patch `atos` failed to symbolize the stacktrace giving output that looked like... ``` #0 0x100fc3bb5 in __sanitizer_print_stack_trace asan_stack.cpp:86 #1 0x10490dd36 in PrintStack+0x56 (/path/to/print-stack-trace-in-code-loaded-after-fork.cpp.tmp_shared_lib.dylib:x86_64+0xd36) rust-lang#2 0x100f6f986 in main+0x4a6 (/path/to/print-stack-trace-in-code-loaded-after-fork.cpp.tmp_loader:x86_64+0x100001986) rust-lang#3 0x7fff714f1cc8 in start+0x0 (/usr/lib/system/libdyld.dylib:x86_64+0x1acc8) ``` After this patch stackframes `#1` and `rust-lang#2` are fully symbolized. This patch is also a pre-requisite refactor for rdar://problem/58789439. Reviewers: kubamracek, yln Subscribers: #sanitizers, llvm-commits Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D77623
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Apr 16, 2020
Summary: crash stack: ``` lang: tools/clang/include/clang/AST/AttrImpl.inc:1490: unsigned int clang::AlignedAttr::getAlignment(clang::ASTContext &) const: Assertion `!isAlignmentDependent()' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: ./bin/clang -cc1 -std=c++1y -ast-dump -frecovery-ast -fcxx-exceptions /tmp/t4.cpp 1. /tmp/t4.cpp:3:31: current parser token ';' #0 0x0000000002530cff llvm::sys::PrintStackTrace(llvm::raw_ostream&) llvm-project/llvm/lib/Support/Unix/Signals.inc:564:13 #1 0x000000000252ee30 llvm::sys::RunSignalHandlers() llvm-project/llvm/lib/Support/Signals.cpp:69:18 rust-lang#2 0x000000000253126c SignalHandler(int) llvm-project/llvm/lib/Support/Unix/Signals.inc:396:3 rust-lang#3 0x00007f86964d0520 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x13520) rust-lang#4 0x00007f8695f9ff61 raise /build/glibc-oCLvUT/glibc-2.29/signal/../sysdeps/unix/sysv/linux/raise.c:51:1 rust-lang#5 0x00007f8695f8b535 abort /build/glibc-oCLvUT/glibc-2.29/stdlib/abort.c:81:7 rust-lang#6 0x00007f8695f8b40f _nl_load_domain /build/glibc-oCLvUT/glibc-2.29/intl/loadmsgcat.c:1177:9 rust-lang#7 0x00007f8695f98b92 (/lib/x86_64-linux-gnu/libc.so.6+0x32b92) rust-lang#8 0x0000000004503d9f llvm::APInt::getZExtValue() const llvm-project/llvm/include/llvm/ADT/APInt.h:1623:5 rust-lang#9 0x0000000004503d9f clang::AlignedAttr::getAlignment(clang::ASTContext&) const llvm-project/build/tools/clang/include/clang/AST/AttrImpl.inc:1492:0 ``` Reviewers: sammccall Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78085
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Apr 17, 2020
Bitcode file alignment is only 32-bit so 64-bit offsets need special handling. /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6327:28: runtime error: load of misaligned address 0x7fca2bcfe54c for type 'const uint64_t' (aka 'const unsigned long'), which requires 8 byte alignment 0x7fca2bcfe54c: note: pointer points here 00 00 00 00 5a a6 01 00 00 00 00 00 19 a7 01 00 00 00 00 00 48 a7 01 00 00 00 00 00 7d a7 01 00 ^ #0 0x3be2fe4 in clang::ASTReader::TypeCursorForIndex(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6327:28 #1 0x3be30a0 in clang::ASTReader::readTypeRecord(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6348:24 rust-lang#2 0x3bd3d4a in clang::ASTReader::GetType(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6985:26 rust-lang#3 0x3c5d9ae in clang::ASTDeclReader::Visit(clang::Decl*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:533:31 rust-lang#4 0x3c91cac in clang::ASTReader::ReadDeclRecord(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:4045:10 rust-lang#5 0x3bd4fb1 in clang::ASTReader::GetDecl(unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:7352:5 rust-lang#6 0x3bce2f9 in clang::ASTReader::ReadASTBlock(clang::serialization::ModuleFile&, unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:3625:22 rust-lang#7 0x3bd6d75 in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, llvm::SmallVectorImpl<clang::ASTReader::ImportedSubmodule>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:4230:32 rust-lang#8 0x3a6b415 in clang::CompilerInstance::createPCHExternalASTSource(llvm::StringRef, llvm::StringRef, bool, bool, clang::Preprocessor&, clang::InMemoryModuleCache&, clang::ASTContext&, clang::PCHContainerReader const&, llvm::ArrayRef<std::shared_ptr<clang::ModuleFileExtension> >, llvm::ArrayRef<std::shared_ptr<clang::DependencyCollector> >, void*, bool, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:539:19 rust-lang#9 0x3a6b00e in clang::CompilerInstance::createPCHExternalASTSource(llvm::StringRef, bool, bool, void*, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:501:18 rust-lang#10 0x3abac80 in clang::FrontendAction::BeginSourceFile(clang::CompilerInstance&, clang::FrontendInputFile const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:865:12 rust-lang#11 0x3a6e61c in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:972:13 rust-lang#12 0x3ba74bf in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:282:25 rust-lang#13 0xa3f753 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/cc1_main.cpp:240:15 rust-lang#14 0xa3a68a in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:330:12 rust-lang#15 0xa37f31 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:407:12 rust-lang#16 0x7fca2a7032e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) rust-lang#17 0xa21029 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/clang-11+0xa21029) This reverts commit 30d5946.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Apr 30, 2020
Summary: For some reason the TestExec test on the macOS bots randomly fails with this error: ``` output: * thread rust-lang#2, stop reason = EXC_BAD_ACCESS (code=1, address=0x108e66000) * frame #0: 0x0000000108e66000 [...] File "/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/lldb/test/API/functionalities/exec/TestExec.py", line 25, in test_hitting_exec self.do_test(False) File "/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/lldb/test/API/functionalities/exec/TestExec.py", line 113, in do_test "Stopped at breakpoint in exec'ed process.") AssertionError: False is not True : Stopped at breakpoint in exec'ed process. Config=x86_64-/Users/buildslave/jenkins/workspace/lldb-cmake/lldb-build/bin/clang-11 ``` I don't know why the test program is failing and I couldn't reproduce this problem on my own. This patch is a stab in the dark that just tries to make the test code more similar to code which we would expect in a user program to make whatever part of macOS happy that is currently not liking our code. The actual changes are: * We pass in argv[0] that is describing otherprog path instead of the current argv[0]. * We pass in a non-null envp (which anyway doesn't seem to be allowed on macOS man page). Reviewers: labath, JDevlieghere Reviewed By: labath Subscribers: abidh, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D75241
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Apr 30, 2020
Summary: In the LLVM IR, "call" instructions read memory for each byval operand. For example: ``` $ cat blah.c struct foo { void *a, *b, *c; }; struct bar { struct foo foo; }; void func1(const struct foo); void func2(struct bar *bar) { func1(bar->foo); } $ [...]/bin/clang -S -flto -c blah.c -O2 ; cat blah.s [...] define dso_local void @func2(%struct.bar* %bar) local_unnamed_addr #0 { entry: %foo = getelementptr inbounds %struct.bar, %struct.bar* %bar, i64 0, i32 0 tail call void @func1(%struct.foo* byval(%struct.foo) align 8 %foo) rust-lang#2 ret void } [...] $ [...]/bin/clang -S -c blah.c -O2 ; cat blah.s [...] func2: # @func2 [...] subq $24, %rsp [...] movq 16(%rdi), %rax movq %rax, 16(%rsp) movups (%rdi), %xmm0 movups %xmm0, (%rsp) callq func1 addq $24, %rsp [...] retq ``` Let ASAN instrument these hidden memory accesses. This is patch 4/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77619
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
May 13, 2020
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
May 13, 2020
… under tail-folding." This was reverted because of a miscompilation. At closer inspection, the problem was actually visible in a changed llvm regression test too. This one-line follow up fix/recommit will splat the IV, which is what we are trying to avoid if unnecessary in general, if tail-folding is requested even if all users are scalar instructions after vectorisation. Because with tail-folding, the splat IV will be used by the predicate of the masked loads/stores instructions. The previous version omitted this, which caused the miscompilation. The original commit message was: If tail-folding of the scalar remainder loop is applied, the primary induction variable is splat to a vector and used by the masked load/store vector instructions, thus the IV does not remain scalar. Because we now mark that the IV does not remain scalar for these cases, we don't emit the vector IV if it is not used. Thus, the vectoriser produces less dead code. Thanks to Ayal Zaks for the direction how to fix this.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
May 16, 2020
Use smart pointer instead of new/delete.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
May 21, 2020
Summary: The previous code tries to strip out parentheses and anything in between them. I'm guessing the idea here was to try to drop any listed arguments for the function being symbolized. Unfortunately this approach is broken in several ways. * Templated functions may contain parentheses. The existing approach messes up these names. * In C++ argument types are part of a function's signature for the purposes of overloading so removing them could be confusing. Fix this simply by not trying to adjust the function name that comes from `atos`. A test case is included. Without the change the test case produced output like: ``` WRITE of size 4 at 0x6060000001a0 thread T0 #0 0x10b96614d in IntWrapper<void >::operator=> const&) asan-symbolize-templated-cxx.cpp:10 #1 0x10b960b0e in void writeToA<IntWrapper<void > >>) asan-symbolize-templated-cxx.cpp:30 rust-lang#2 0x10b96bf27 in decltype>)>> >)) std::__1::__invoke<void >), IntWrapper<void > >>), IntWrapper<void >&&) type_traits:4425 rust-lang#3 0x10b96bdc1 in void std::__1::__invoke_void_return_wrapper<void>::__call<void >), IntWrapper<void > >>), IntWrapper<void >&&) __functional_base:348 rust-lang#4 0x10b96bd71 in std::__1::__function::__alloc_func<void >), std::__1::allocator<void >)>, void >)>::operator>&&) functional:1533 rust-lang#5 0x10b9684e2 in std::__1::__function::__func<void >), std::__1::allocator<void >)>, void >)>::operator>&&) functional:1707 rust-lang#6 0x10b96cd7b in std::__1::__function::__value_func<void >)>::operator>&&) const functional:1860 rust-lang#7 0x10b96cc17 in std::__1::function<void >)>::operator>) const functional:2419 rust-lang#8 0x10b960ca6 in Foo<void >), IntWrapper<void > >::doCall>) asan-symbolize-templated-cxx.cpp:44 rust-lang#9 0x10b96088b in main asan-symbolize-templated-cxx.cpp:54 rust-lang#10 0x7fff6ffdfcc8 in start (in libdyld.dylib) + 0 ``` Note how the symbol names for the frames are messed up (e.g. rust-lang#8, #1). With the patch the output looks like: ``` WRITE of size 4 at 0x6060000001a0 thread T0 #0 0x10005214d in IntWrapper<void (int)>::operator=(IntWrapper<void (int)> const&) asan-symbolize-templated-cxx.cpp:10 #1 0x10004cb0e in void writeToA<IntWrapper<void (int)> >(IntWrapper<void (int)>) asan-symbolize-templated-cxx.cpp:30 rust-lang#2 0x100057f27 in decltype(std::__1::forward<void (*&)(IntWrapper<void (int)>)>(fp)(std::__1::forward<IntWrapper<void (int)> >(fp0))) std::__1::__invoke<void (*&)(IntWrapper<void (int)>), IntWrapper<void (int)> >(void (*&)(IntWrapper<void (int)>), IntWrapper<void (int)>&&) type_traits:4425 rust-lang#3 0x100057dc1 in void std::__1::__invoke_void_return_wrapper<void>::__call<void (*&)(IntWrapper<void (int)>), IntWrapper<void (int)> >(void (*&)(IntWrapper<void (int)>), IntWrapper<void (int)>&&) __functional_base:348 rust-lang#4 0x100057d71 in std::__1::__function::__alloc_func<void (*)(IntWrapper<void (int)>), std::__1::allocator<void (*)(IntWrapper<void (int)>)>, void (IntWrapper<void (int)>)>::operator()(IntWrapper<void (int)>&&) functional:1533 rust-lang#5 0x1000544e2 in std::__1::__function::__func<void (*)(IntWrapper<void (int)>), std::__1::allocator<void (*)(IntWrapper<void (int)>)>, void (IntWrapper<void (int)>)>::operator()(IntWrapper<void (int)>&&) functional:1707 rust-lang#6 0x100058d7b in std::__1::__function::__value_func<void (IntWrapper<void (int)>)>::operator()(IntWrapper<void (int)>&&) const functional:1860 rust-lang#7 0x100058c17 in std::__1::function<void (IntWrapper<void (int)>)>::operator()(IntWrapper<void (int)>) const functional:2419 rust-lang#8 0x10004cca6 in Foo<void (IntWrapper<void (int)>), IntWrapper<void (int)> >::doCall(IntWrapper<void (int)>) asan-symbolize-templated-cxx.cpp:44 rust-lang#9 0x10004c88b in main asan-symbolize-templated-cxx.cpp:54 rust-lang#10 0x7fff6ffdfcc8 in start (in libdyld.dylib) + 0 ``` rdar://problem/58887175 Reviewers: kubamracek, yln Subscribers: #sanitizers, llvm-commits Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D79597
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Jun 12, 2020
Summary: crash stack: ``` llvm-project/clang/lib/AST/ASTContext.cpp:2248: clang::TypeInfo clang::ASTContext::getTypeInfoImpl(const clang::Type *) const: Assertion `!A->getDeducedType().isNull() && "cannot request the size of an undeduced or dependent auto type"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: #0 0x00000000025bb0bf llvm::sys::PrintStackTrace(llvm::raw_ostream&) llvm-project/llvm/lib/Support/Unix/Signals.inc:564:13 #1 0x00000000025b92b0 llvm::sys::RunSignalHandlers() llvm-project/llvm/lib/Support/Signals.cpp:69:18 rust-lang#2 0x00000000025bb535 SignalHandler(int) llvm-project/llvm/lib/Support/Unix/Signals.inc:396:3 rust-lang#3 0x00007f9ef9298110 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14110) rust-lang#4 0x00007f9ef8d72761 raise /build/glibc-M65Gwz/glibc-2.30/signal/../sysdeps/unix/sysv/linux/raise.c:51:1 rust-lang#5 0x00007f9ef8d5c55b abort /build/glibc-M65Gwz/glibc-2.30/stdlib/abort.c:81:7 rust-lang#6 0x00007f9ef8d5c42f get_sysdep_segment_value /build/glibc-M65Gwz/glibc-2.30/intl/loadmsgcat.c:509:8 rust-lang#7 0x00007f9ef8d5c42f _nl_load_domain /build/glibc-M65Gwz/glibc-2.30/intl/loadmsgcat.c:970:34 rust-lang#8 0x00007f9ef8d6b092 (/lib/x86_64-linux-gnu/libc.so.6+0x34092) rust-lang#9 0x000000000458abe0 clang::ASTContext::getTypeInfoImpl(clang::Type const*) const llvm-project/clang/lib/AST/ASTContext.cpp:0:5 ``` Reviewers: sammccall Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81384
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Jul 17, 2020
… Solaris/x86 A dozen 32-bit `AddressSanitizer` testcases FAIL on the latest beta of Solaris 11.4/x86, e.g. `AddressSanitizer-i386-sunos :: TestCases/null_deref.cpp` produces AddressSanitizer:DEADLYSIGNAL ================================================================= ==29274==ERROR: AddressSanitizer: stack-overflow on address 0x00000028 (pc 0x08135efd bp 0xfeffdfd8 sp 0x00000000 T0) #0 0x8135efd in NullDeref(int*) /vol/llvm/src/llvm-project/dist/compiler-rt/test/asan/TestCases/null_deref.cpp:15:10 #1 0x8135ea6 in main /vol/llvm/src/llvm-project/dist/compiler-rt/test/asan/TestCases/null_deref.cpp:21:3 rust-lang#2 0x8084b85 in _start (null_deref.cpp.tmp+0x8084b85) SUMMARY: AddressSanitizer: stack-overflow /vol/llvm/src/llvm-project/dist/compiler-rt/test/asan/TestCases/null_deref.cpp:15:10 in NullDeref(int*) ==29274==ABORTING instead of the expected AddressSanitizer:DEADLYSIGNAL ================================================================= ==29276==ERROR: AddressSanitizer: SEGV on unknown address 0x00000028 (pc 0x08135f1f bp 0xfeffdf48 sp 0xfeffdf40 T0) ==29276==The signal is caused by a WRITE memory access. ==29276==Hint: address points to the zero page. #0 0x8135f1f in NullDeref(int*) /vol/llvm/src/llvm-project/local/compiler-rt/test/asan/TestCases/null_deref.cpp:15:10 #1 0x8135efa in main /vol/llvm/src/llvm-project/local/compiler-rt/test/asan/TestCases/null_deref.cpp:21:3 rust-lang#2 0x8084be5 in _start (null_deref.cpp.tmp+0x8084be5) AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV /vol/llvm/src/llvm-project/local/compiler-rt/test/asan/TestCases/null_deref.cpp:15:10 in NullDeref(int*) ==29276==ABORTING I managed to trace this to a change in `<sys/regset.h>`: previously the header would primarily define the short register indices (like `UESP`). While they are required by the i386 psABI, they are only required in `<ucontext.h>` and could previously leak into unsuspecting user code, polluting the namespace and requiring elaborate workarounds like that in `llvm/include/llvm/Support/Solaris/sys/regset.h`. The change fixed that by restricting the definition of the short forms appropriately, at the same time defining all `REG_` prefixed forms for compatiblity with other systems. This exposed a bug in `compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp`, however: Previously, the index for the user stack pointer would be hardcoded if `REG_ESP` wasn't defined. Now with that definition present, it turned out that `REG_ESP` was the wrong index to use: the previous value 17 (and `REG_SP`) corresponds to `REG_UESP` instead. With that change, the failures are all gone. Tested on `amd-pc-solaris2.11`. Differential Revision: https://reviews.llvm.org/D83664
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Aug 8, 2020
…RM` was undefined after definition. `PP->getMacroInfo()` returns nullptr for undefined macro, which leads to null-dereference at `MI->tockens().back()`. Stack dump: ``` #0 0x000000000217d15a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x217d15a) #1 0x000000000217b17c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x217b17c) rust-lang#2 0x000000000217b2e3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x217b2e3) rust-lang#3 0x00007f39be5b1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390) rust-lang#4 0x0000000000593532 clang::tidy::bugprone::BadSignalToKillThreadCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x593532) ``` Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D85401
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Aug 8, 2020
…RM` is not a literal. If `SIGTERM` is not a literal (e.g. `#define SIGTERM ((unsigned)15)`) bugprone-bad-signal-to-kill-thread check crashes. Stack dump: ``` #0 0x000000000217d15a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x217d15a) #1 0x000000000217b17c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x217b17c) rust-lang#2 0x000000000217b2e3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x217b2e3) rust-lang#3 0x00007f6a7efb1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390) rust-lang#4 0x000000000212ac9b llvm::StringRef::getAsInteger(unsigned int, llvm::APInt&) const (/llvm-project/build/bin/clang-tidy+0x212ac9b) rust-lang#5 0x0000000000593501 clang::tidy::bugprone::BadSignalToKillThreadCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x593501) ``` Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D85398
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Aug 20, 2020
… when `__STDC_WANT_LIB_EXT1__` was undefined after definition. PP->getMacroInfo() returns nullptr for undefined macro, so we need to check this return value before dereference. Stack dump: ``` #0 0x0000000002185e6a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x2185e6a) #1 0x0000000002183e8c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x2183e8c) rust-lang#2 0x0000000002183ff3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x2183ff3) rust-lang#3 0x00007f37df9b1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390) rust-lang#4 0x000000000052054e clang::tidy::bugprone::NotNullTerminatedResultCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x52054e) ``` Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D85523
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Jul 8, 2024
This test is currently flaky on a local Windows amd64 build. The reason is that it relies on the order of `process.threads` but this order is nondeterministic: If we print lldb's inputs and outputs while running, we can see that the breakpoints are always being set correctly, and always being hit: ```sh runCmd: breakpoint set -f "main.c" -l 2 output: Breakpoint 1: where = a.out`func_inner + 1 at main.c:2:9, address = 0x0000000140001001 runCmd: breakpoint set -f "main.c" -l 7 output: Breakpoint 2: where = a.out`main + 17 at main.c:7:5, address = 0x0000000140001021 runCmd: run output: Process 52328 launched: 'C:\workspace\llvm-project\llvm\build\lldb-test-build.noindex\functionalities\unwind\zeroth_frame\TestZerothFrame.test_dwarf\a.out' (x86_64) Process 52328 stopped * thread #1, stop reason = breakpoint 1.1 frame #0: 0x00007ff68f6b1001 a.out`func_inner at main.c:2:9 1 void func_inner() { -> 2 int a = 1; // Set breakpoint 1 here ^ 3 } 4 5 int main() { 6 func_inner(); 7 return 0; // Set breakpoint 2 here ``` However, sometimes the backtrace printed in this test shows that the process is stopped inside NtWaitForWorkViaWorkerFactory from `ntdll.dll`: ```sh Backtrace at the first breakpoint: frame #0: 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 frame #1: 0x00007ffecc74585e ntdll.dll`RtlClearThreadWorkOnBehalfTicket + 862 frame rust-lang#2: 0x00007ffecc3e257d kernel32.dll`BaseThreadInitThunk + 29 frame rust-lang#3: 0x00007ffecc76af28 ntdll.dll`RtlUserThreadStart + 40 ``` When this happens, the test fails with an assertion error that the stopped thread's zeroth frame's current line number does not match the expected line number. This is because the test is looking at the wrong thread: `process.threads[0]`. If we print the list of threads each time the test is run, we notice that threads are sometimes in a different order, within `process.threads`: ```sh Thread 0: thread rust-lang#4: tid = 0x9c38, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 1: thread rust-lang#2: tid = 0xa950, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 2: thread #1: tid = 0xab18, 0x00007ff64bc81001 a.out`func_inner at main.c:2:9, stop reason = breakpoint 1.1 Thread 3: thread rust-lang#3: tid = 0xc514, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 0: thread rust-lang#3: tid = 0x018c, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 1: thread #1: tid = 0x85c8, 0x00007ff7130c1001 a.out`func_inner at main.c:2:9, stop reason = breakpoint 1.1 Thread 2: thread rust-lang#2: tid = 0xf344, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 3: thread rust-lang#4: tid = 0x6a50, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 ``` Use `self.thread()` to consistently select the correct thread, instead. Co-authored-by: kendal <kendal@thebrowser.company>
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Jul 9, 2024
…izations of function templates to USRGenerator (llvm#98027) Given the following: ``` template<typename T> struct A { void f(int); // #1 template<typename U> void f(U); // rust-lang#2 template<> void f<int>(int); // rust-lang#3 }; ``` Clang will generate the same USR for `#1` and `rust-lang#2`. This patch fixes the issue by including the template arguments of dependent class scope explicit specializations in their USRs.
wesleywiser
pushed a commit
to wesleywiser/llvm-project
that referenced
this pull request
Jul 17, 2024
This patch adds a frame recognizer for Clang's `__builtin_verbose_trap`, which behaves like a `__builtin_trap`, but emits a failure-reason string into debug-info in order for debuggers to display it to a user. The frame recognizer triggers when we encounter a frame with a function name that begins with `__clang_trap_msg`, which is the magic prefix Clang emits into debug-info for verbose traps. Once such frame is encountered we display the frame function name as the `Stop Reason` and display that frame to the user. Example output: ``` (lldb) run warning: a.out was compiled with optimization - stepping may behave oddly; variables may not be available. Process 35942 launched: 'a.out' (arm64) Process 35942 stopped * thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = Misc.: Function is not implemented frame rust-lang#1: 0x0000000100003fa4 a.out`main [inlined] Dummy::func(this=<unavailable>) at verbose_trap.cpp:3:5 [opt] 1 struct Dummy { 2 void func() { -> 3 __builtin_verbose_trap("Misc.", "Function is not implemented"); 4 } 5 }; 6 7 int main() { (lldb) bt * thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = Misc.: Function is not implemented frame #0: 0x0000000100003fa4 a.out`main [inlined] __clang_trap_msg$Misc.$Function is not implemented$ at verbose_trap.cpp:0 [opt] * frame rust-lang#1: 0x0000000100003fa4 a.out`main [inlined] Dummy::func(this=<unavailable>) at verbose_trap.cpp:3:5 [opt] frame rust-lang#2: 0x0000000100003fa4 a.out`main at verbose_trap.cpp:8:13 [opt] frame rust-lang#3: 0x0000000189d518b4 dyld`start + 1988 ```
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Jul 25, 2024
…linux (llvm#99613) Examples of the output: ARM: ``` # ./a.out AddressSanitizer:DEADLYSIGNAL ================================================================= ==122==ERROR: AddressSanitizer: SEGV on unknown address 0x0000007a (pc 0x76e13ac0 bp 0x7eb7fd00 sp 0x7eb7fcc8 T0) ==122==The signal is caused by a READ memory access. ==122==Hint: address points to the zero page. #0 0x76e13ac0 (/lib/libc.so.6+0x7cac0) #1 0x76dce680 in gsignal (/lib/libc.so.6+0x37680) rust-lang#2 0x005c2250 (/root/a.out+0x145250) rust-lang#3 0x76db982c (/lib/libc.so.6+0x2282c) rust-lang#4 0x76db9918 in __libc_start_main (/lib/libc.so.6+0x22918) ==122==Register values: r0 = 0x00000000 r1 = 0x0000007a r2 = 0x0000000b r3 = 0x76d95020 r4 = 0x0000007a r5 = 0x00000001 r6 = 0x005dcc5c r7 = 0x0000010c r8 = 0x0000000b r9 = 0x76f9ece0 r10 = 0x00000000 r11 = 0x7eb7fd00 r12 = 0x76dce670 sp = 0x7eb7fcc8 lr = 0x76e13ab4 pc = 0x76e13ac0 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (/lib/libc.so.6+0x7cac0) ==122==ABORTING ``` AArch64: ``` # ./a.out UndefinedBehaviorSanitizer:DEADLYSIGNAL ==99==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000063 (pc 0x007fbbbc5860 bp 0x007fcfdcb700 sp 0x007fcfdcb700 T99) ==99==The signal is caused by a UNKNOWN memory access. ==99==Hint: address points to the zero page. #0 0x007fbbbc5860 (/lib64/libc.so.6+0x82860) #1 0x007fbbb81578 (/lib64/libc.so.6+0x3e578) rust-lang#2 0x00556051152c (/root/a.out+0x3152c) rust-lang#3 0x007fbbb6e268 (/lib64/libc.so.6+0x2b268) rust-lang#4 0x007fbbb6e344 (/lib64/libc.so.6+0x2b344) rust-lang#5 0x0055604e45ec (/root/a.out+0x45ec) ==99==Register values: x0 = 0x0000000000000000 x1 = 0x0000000000000063 x2 = 0x000000000000000b x3 = 0x0000007fbbb41440 x4 = 0x0000007fbbb41580 x5 = 0x3669288942d44cce x6 = 0x0000000000000000 x7 = 0x00000055605110b0 x8 = 0x0000000000000083 x9 = 0x0000000000000000 x10 = 0x0000000000000000 x11 = 0x0000000000000000 x12 = 0x0000007fbbdb3360 x13 = 0x0000000000010000 x14 = 0x0000000000000039 x15 = 0x00000000004113a0 x16 = 0x0000007fbbb81560 x17 = 0x0000005560540138 x18 = 0x000000006474e552 x19 = 0x0000000000000063 x20 = 0x0000000000000001 x21 = 0x000000000000000b x22 = 0x0000005560511510 x23 = 0x0000007fcfdcb918 x24 = 0x0000007fbbdb1b50 x25 = 0x0000000000000000 x26 = 0x0000007fbbdb2000 x27 = 0x000000556053f858 x28 = 0x0000000000000000 fp = 0x0000007fcfdcb700 lr = 0x0000007fbbbc584c sp = 0x0000007fcfdcb700 UndefinedBehaviorSanitizer can not provide additional info. SUMMARY: UndefinedBehaviorSanitizer: SEGV (/lib64/libc.so.6+0x82860) ==99==ABORTING ```
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Aug 5, 2024
``` UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp ``` `FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the test isn't Linux-specific at all). With `UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a duplicate innermost frame: ``` compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35 #1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35 rust-lang#2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38 ``` which isn't seen with `fast_unwind_on_fatal=0`. This turns out to be another fallout from fixing `__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In `sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the `pc` arg is the return address, while `pc1` from the stack frame (`fr_savpc`) is the address of the `call` insn, leading to a double entry for the innermost frame in `trace_buffer[]`. This patch fixes this by moving the adjustment before all uses. Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11` (with the `ubsan/TestCases/Misc/Linux` tests enabled).
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Aug 5, 2024
``` UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp ``` `FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the test isn't Linux-specific at all). With `UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a duplicate innermost frame: ``` compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35 #1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35 rust-lang#2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38 ``` which isn't seen with `fast_unwind_on_fatal=0`. This turns out to be another fallout from fixing `__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In `sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the `pc` arg is the return address, while `pc1` from the stack frame (`fr_savpc`) is the address of the `call` insn, leading to a double entry for the innermost frame in `trace_buffer[]`. This patch fixes this by moving the adjustment before all uses. Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11` (with the `ubsan/TestCases/Misc/Linux` tests enabled). (cherry picked from commit 3368a32)
wesleywiser
pushed a commit
to wesleywiser/llvm-project
that referenced
this pull request
Aug 21, 2024
…lvm#104148) `hasOperands` does not always execute matchers in the order they are written. This can cause issue in code using bindings when one operand matcher is relying on a binding set by the other. With this change, the first matcher present in the code is always executed first and any binding it sets are available to the second matcher. Simple example with current version (1 match) and new version (2 matches): ```bash > cat tmp.cpp int a = 13; int b = ((int) a) - a; int c = a - ((int) a); > clang-query tmp.cpp clang-query> set traversal IgnoreUnlessSpelledInSource clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d")))))) Match rust-lang#1: tmp.cpp:1:1: note: "d" binds here int a = 13; ^~~~~~~~~~ tmp.cpp:2:9: note: "root" binds here int b = ((int)a) - a; ^~~~~~~~~~~~ 1 match. > ./build/bin/clang-query tmp.cpp clang-query> set traversal IgnoreUnlessSpelledInSource clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d")))))) Match rust-lang#1: tmp.cpp:1:1: note: "d" binds here 1 | int a = 13; | ^~~~~~~~~~ tmp.cpp:2:9: note: "root" binds here 2 | int b = ((int)a) - a; | ^~~~~~~~~~~~ Match rust-lang#2: tmp.cpp:1:1: note: "d" binds here 1 | int a = 13; | ^~~~~~~~~~ tmp.cpp:3:9: note: "root" binds here 3 | int c = a - ((int)a); | ^~~~~~~~~~~~ 2 matches. ``` If this should be documented or regression tested anywhere please let me know where.
wesleywiser
pushed a commit
to wesleywiser/llvm-project
that referenced
this pull request
Aug 21, 2024
…104523) Compilers and language runtimes often use helper functions that are fundamentally uninteresting when debugging anything but the compiler/runtime itself. This patch introduces a user-extensible mechanism that allows for these frames to be hidden from backtraces and automatically skipped over when navigating the stack with `up` and `down`. This does not affect the numbering of frames, so `f <N>` will still provide access to the hidden frames. The `bt` output will also print a hint that frames have been hidden. My primary motivation for this feature is to hide thunks in the Swift programming language, but I'm including an example recognizer for `std::function::operator()` that I wished for myself many times while debugging LLDB. rdar://126629381 Example output. (Yes, my proof-of-concept recognizer could hide even more frames if we had a method that returned the function name without the return type or I used something that isn't based off regex, but it's really only meant as an example). before: ``` (lldb) thread backtrace --filtered=false * thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10 frame rust-lang#1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25 frame rust-lang#2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12 frame rust-lang#3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12 frame rust-lang#4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10 frame rust-lang#5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12 frame rust-lang#6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10 frame rust-lang#7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10 frame rust-lang#8: 0x0000000183cdf154 dyld`start + 2476 (lldb) ``` after ``` (lldb) bt * thread rust-lang#1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10 frame rust-lang#1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25 frame rust-lang#2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12 frame rust-lang#6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10 frame rust-lang#7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10 frame rust-lang#8: 0x0000000183cdf154 dyld`start + 2476 Note: Some frames were hidden by frame recognizers ```
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Aug 26, 2024
) Currently, process of replacing bitwise operations consisting of `LSR`/`LSL` with `And` is performed by `DAGCombiner`. However, in certain cases, the `AND` generated by this process can be removed. Consider following case: ``` lsr x8, x8, rust-lang#56 and x8, x8, #0xfc ldr w0, [x2, x8] ret ``` In this case, we can remove the `AND` by changing the target of `LDR` to `[X2, X8, LSL rust-lang#2]` and right-shifting amount change to 56 to 58. after changed: ``` lsr x8, x8, rust-lang#58 ldr w0, [x2, x8, lsl rust-lang#2] ret ``` This patch checks to see if the `SHIFTING` + `AND` operation on load target can be optimized and optimizes it if it can.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Aug 28, 2024
`JITDylibSearchOrderResolver` local variable can be destroyed before completion of all callbacks. Capture it together with `Deps` in `OnEmitted` callback. Original error: ``` ==2035==ERROR: AddressSanitizer: stack-use-after-return on address 0x7bebfa155b70 at pc 0x7ff2a9a88b4a bp 0x7bec08d51980 sp 0x7bec08d51978 READ of size 8 at 0x7bebfa155b70 thread T87 (tf_xla-cpu-llvm) #0 0x7ff2a9a88b49 in operator() llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:58 #1 0x7ff2a9a88b49 in __invoke<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:149:25 rust-lang#2 0x7ff2a9a88b49 in __call<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:224:5 rust-lang#3 0x7ff2a9a88b49 in operator() libcxx/include/__functional/function.h:210:12 rust-lang#4 0x7ff2a9a88b49 in void std::__u::__function::__policy_invoker<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, ```
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Sep 2, 2024
Static destructor can race with calls to notify and trigger tsan warning. ``` WARNING: ThreadSanitizer: data race (pid=5787) Write of size 1 at 0x55bec9df8de8 by thread T23: #0 pthread_mutex_destroy [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1344](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=1344&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b12affb) (BuildId: ff25ace8b17d9863348bb1759c47246c) #1 __libcpp_recursive_mutex_destroy [third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h:91](third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h?l=91&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d4e9) (BuildId: ff25ace8b17d9863348bb1759c47246c) rust-lang#2 std::__tsan::recursive_mutex::~recursive_mutex() [third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp:52](third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp?l=52&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d4e9) rust-lang#3 ~SmartMutex [third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h:28](third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h?l=28&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaedfe) (BuildId: ff25ace8b17d9863348bb1759c47246c) rust-lang#4 (anonymous namespace)::PerfJITEventListener::~PerfJITEventListener() [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:65](third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=65&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaedfe) rust-lang#5 cxa_at_exit_callback_installed_at(void*) [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:437](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=437&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b172cb9) (BuildId: ff25ace8b17d9863348bb1759c47246c) rust-lang#6 llvm::JITEventListener::createPerfJITEventListener() [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:496](third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=496&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcad8f5) (BuildId: ff25ace8b17d9863348bb1759c47246c) ``` ``` Previous atomic read of size 1 at 0x55bec9df8de8 by thread T192 (mutexes: write M0, write M1): #0 pthread_mutex_unlock [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1387](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=1387&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b12b6bb) (BuildId: ff25ace8b17d9863348bb1759c47246c) #1 __libcpp_recursive_mutex_unlock [third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h:87](third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h?l=87&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d589) (BuildId: ff25ace8b17d9863348bb1759c47246c) rust-lang#2 std::__tsan::recursive_mutex::unlock() [third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp:64](third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp?l=64&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d589) rust-lang#3 unlock [third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h:47](third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h?l=47&cl=669089572):16 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968) (BuildId: ff25ace8b17d9863348bb1759c47246c) rust-lang#4 ~lock_guard [third_party/crosstool/v18/stable/src/libcxx/include/__mutex/lock_guard.h:39](third_party/crosstool/v18/stable/src/libcxx/include/__mutex/lock_guard.h?l=39&cl=669089572):101 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968) rust-lang#5 (anonymous namespace)::PerfJITEventListener::notifyObjectLoaded(unsigned long, llvm::object::ObjectFile const&, llvm::RuntimeDyld::LoadedObjectInfo const&) [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:290](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=290&cl=669089572):1 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968) rust-lang#6 llvm::orc::RTDyldObjectLinkingLayer::onObjEmit(llvm::orc::MaterializationResponsibility&, llvm::object::OwningBinary<llvm::object::ObjectFile>, std::__tsan::unique_ptr<llvm::RuntimeDyld::MemoryManager, std::__tsan::default_delete<llvm::RuntimeDyld::MemoryManager>>, std::__tsan::unique_ptr<llvm::RuntimeDyld::LoadedObjectInfo, std::__tsan::default_delete<llvm::RuntimeDyld::LoadedObjectInfo>>, std::__tsan::unique_ptr<llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>>, std::__tsan::default_delete<llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>>>>, llvm::Error) [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:386](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp?l=386&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bc404a8) (BuildId: ff25ace8b17d9863348bb1759c47246c) ```
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Sep 9, 2024
…llvm#94981) This extends default argument deduction to cover class templates as well, applying only to partial ordering, adding to the provisional wording introduced in llvm#89807. This solves some ambuguity introduced in P0522 regarding how template template parameters are partially ordered, and should reduce the negative impact of enabling `-frelaxed-template-template-args` by default. Given the following example: ```C++ template <class T1, class T2 = float> struct A; template <class T3> struct B; template <template <class T4> class TT1, class T5> struct B<TT1<T5>>; // #1 template <class T6, class T7> struct B<A<T6, T7>>; // rust-lang#2 template struct B<A<int>>; ``` Prior to P0522, `rust-lang#2` was picked. Afterwards, this became ambiguous. This patch restores the pre-P0522 behavior, `rust-lang#2` is picked again.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Sep 16, 2024
When SPARC Asan testing is enabled by PR llvm#107405, many Linux/sparc64 tests just hang like ``` #0 0xf7ae8e90 in syscall () from /usr/lib32/libc.so.6 #1 0x701065e8 in __sanitizer::FutexWait(__sanitizer::atomic_uint32_t*, unsigned int) () at compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp:766 rust-lang#2 0x70107c90 in Wait () at compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:35 rust-lang#3 0x700f7cac in Lock () at compiler-rt/lib/asan/../sanitizer_common/sanitizer_mutex.h:196 rust-lang#4 Lock () at compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_registry.h:98 rust-lang#5 LockThreads () at compiler-rt/lib/asan/asan_thread.cpp:489 rust-lang#6 0x700e9c8c in __asan::BeforeFork() () at compiler-rt/lib/asan/asan_posix.cpp:157 rust-lang#7 0xf7ac83f4 in ?? () from /usr/lib32/libc.so.6 Backtrace stopped: previous frame identical to this frame (corrupt stack?) ``` It turns out that this happens in tests using `internal_fork` (e.g. invoking `llvm-symbolizer`): unlike most other Linux targets, which use `clone`, Linux/sparc64 has to use `__fork` instead. While `clone` doesn't trigger `pthread_atfork` handlers, `__fork` obviously does, causing the hang. To avoid this, this patch disables `InstallAtForkHandler` and lets the ASan tests run to completion. Tested on `sparc64-unknown-linux-gnu`.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Sep 19, 2024
…ap (llvm#108825) This attempts to improve user-experience when LLDB stops on a verbose_trap. Currently if a `__builtin_verbose_trap` triggers, we display the first frame above the call to the verbose_trap. So in the newly added test case, we would've previously stopped here: ``` (lldb) run Process 28095 launched: '/Users/michaelbuch/a.out' (arm64) Process 28095 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access frame #1: 0x0000000100003f5c a.out`std::__1::vector<int>::operator[](this=0x000000016fdfebef size=0, (null)=10) at verbose_trap.cpp:6:9 3 template <typename T> 4 struct vector { 5 void operator[](unsigned) { -> 6 __builtin_verbose_trap("Bounds error", "out-of-bounds access"); 7 } 8 }; ``` After this patch, we would stop in the first non-`std` frame: ``` (lldb) run Process 27843 launched: '/Users/michaelbuch/a.out' (arm64) Process 27843 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access frame rust-lang#2: 0x0000000100003f44 a.out`g() at verbose_trap.cpp:14:5 11 12 void g() { 13 std::vector<int> v; -> 14 v[10]; 15 } 16 ``` rdar://134490328
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Oct 1, 2024
…ext is not fully initialized (llvm#110481) As this comment around target initialization implies: ``` // This can be NULL if we don't know anything about the architecture or if // the target for an architecture isn't enabled in the llvm/clang that we // built ``` There are cases where we might fail to call `InitBuiltinTypes` when creating the backing `ASTContext` for a `TypeSystemClang`. If that happens, the builtins `QualType`s, e.g., `VoidPtrTy`/`IntTy`/etc., are not initialized and dereferencing them as we do in `GetBuiltinTypeForEncodingAndBitSize` (and other places) will lead to nullptr-dereferences. Example backtrace: ``` (lldb) run Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 958. Process 2680 stopped * thread rust-lang#15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert frame rust-lang#4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + liblldb.20.0.0git.dylib`DWARFASTParserClang::ParseObjCMethod(lldb_private::ObjCLanguage::MethodName const&, lldb_private::plugin::dwarf::DWARFDIE const&, lldb_private::CompilerType, ParsedDWARFTypeAttributes , bool) (.cold.1): -> 0x10cdf3cdc <+0>: stp x29, x30, [sp, #-0x10]! 0x10cdf3ce0 <+4>: mov x29, sp 0x10cdf3ce4 <+8>: adrp x0, 545 0x10cdf3ce8 <+12>: add x0, x0, #0xa25 ; "ParseObjCMethod" Target 0: (lldb) stopped. (lldb) bt * thread rust-lang#15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert frame #0: 0x0000000180d08600 libsystem_kernel.dylib`__pthread_kill + 8 frame #1: 0x0000000180d40f50 libsystem_pthread.dylib`pthread_kill + 288 frame rust-lang#2: 0x0000000180c4d908 libsystem_c.dylib`abort + 128 frame rust-lang#3: 0x0000000180c4cc1c libsystem_c.dylib`__assert_rtn + 284 * frame rust-lang#4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + frame rust-lang#5: 0x0000000109d30acc liblldb.20.0.0git.dylib`lldb_private::TypeSystemClang::GetBuiltinTypeForEncodingAndBitSize(lldb::Encoding, unsigned long) + 1188 frame rust-lang#6: 0x0000000109aaaed4 liblldb.20.0.0git.dylib`DynamicLoaderMacOS::NotifyBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 384 ``` This patch adds a one-time user-visible warning for when we fail to initialize the AST to indicate that initialization went wrong for the given target. Additionally, we add checks for whether one of the `ASTContext` `QualType`s is invalid before dereferencing any builtin types. The warning would look as follows: ``` (lldb) target create "a.out" Current executable set to 'a.out' (arm64). (lldb) b main warning: Failed to initialize builtin ASTContext types for target 'some-unknown-triple'. Printing variables may behave unexpectedly. Breakpoint 1: where = a.out`main + 8 at stepping.cpp:5:14, address = 0x0000000100003f90 ``` rdar://134869779
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Oct 4, 2024
Fixes llvm#102703. https://godbolt.org/z/nfj8xsb1Y The following pattern: ``` %2 = and i32 %0, 254 %3 = icmp eq i32 %2, 0 ``` is optimised by instcombine into: ```%3 = icmp ult i32 %0, 2``` However, post instcombine leads to worse aarch64 than the unoptimised version. Pre instcombine: ``` tst w0, #0xfe cset w0, eq ret ``` Post instcombine: ``` and w8, w0, #0xff cmp w8, rust-lang#2 cset w0, lo ret ``` In the unoptimised version, SelectionDAG converts `SETCC (AND X 254) 0 EQ` into `CSEL 0 1 1 (ANDS X 254)`, which gets emitted as a `tst`. In the optimised version, SelectionDAG converts `SETCC (AND X 255) 2 ULT` into `CSEL 0 1 2 (SUBS (AND X 255) 2)`, which gets emitted as an `and`/`cmp`. This PR adds an optimisation to `AArch64ISelLowering`, converting `SETCC (AND X Y) Z ULT` into `SETCC (AND X (Y & ~(Z - 1))) 0 EQ` when `Z` is a power of two. This makes SelectionDAG/Codegen produce the same optimised code for both examples.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Oct 15, 2024
…1409)" This reverts commit a89e016. This is being reverted because it broke the test: Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input CHECK: frame rust-lang#2: {{.*}}`main
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Nov 1, 2024
…ates explicitly specialized for an implicitly instantiated class template specialization (llvm#113464) Consider the following: ``` template<typename T> struct A { template<typename U> struct B { static constexpr int x = 0; // #1 }; template<typename U> struct B<U*> { static constexpr int x = 1; // rust-lang#2 }; }; template<> template<typename U> struct A<long>::B { static constexpr int x = 2; // rust-lang#3 }; static_assert(A<short>::B<int>::y == 0); // uses #1 static_assert(A<short>::B<int*>::y == 1); // uses rust-lang#2 static_assert(A<long>::B<int>::y == 2); // uses rust-lang#3 static_assert(A<long>::B<int*>::y == 2); // uses rust-lang#3 ``` According to [temp.spec.partial.member] p2: > If the primary member template is explicitly specialized for a given (implicit) specialization of the enclosing class template, the partial specializations of the member template are ignored for this specialization of the enclosing class template. If a partial specialization of the member template is explicitly specialized for a given (implicit) specialization of the enclosing class template, the primary member template and its other partial specializations are still considered for this specialization of the enclosing class template. The example above fails to compile because we currently don't implement [temp.spec.partial.member] p2. This patch implements the wording, fixing llvm#51051.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Nov 11, 2024
… depobj construct (llvm#114221) A codegen crash is occurring when a depend object was initialized with omp_all_memory in the depobj directive. llvm#114214 The root cause of issue looks to be the improper handling of the dependency list when omp_all_memory was specified. The change introduces the use of OMPTaskDataTy to manage dependencies. The buildDependences function is called to construct the dependency list, and the list is iterated over to emit and store the dependencies. Reduced Test Case : ``` #include <omp.h> int main() { omp_depend_t obj; #pragma omp depobj(obj) depend(inout: omp_all_memory) } ``` ``` #1 0x0000000003de6623 SignalHandler(int) Signals.cpp:0:0 rust-lang#2 0x00007f8e4a6b990f (/lib64/libpthread.so.0+0x1690f) rust-lang#3 0x00007f8e4a117d2a raise (/lib64/libc.so.6+0x4ad2a) rust-lang#4 0x00007f8e4a1193e4 abort (/lib64/libc.so.6+0x4c3e4) rust-lang#5 0x00007f8e4a10fc69 __assert_fail_base (/lib64/libc.so.6+0x42c69) rust-lang#6 0x00007f8e4a10fcf1 __assert_fail (/lib64/libc.so.6+0x42cf1) rust-lang#7 0x0000000004114367 clang::CodeGen::CodeGenFunction::EmitOMPDepobjDirective(clang::OMPDepobjDirective const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4114367) rust-lang#8 0x00000000040f8fac clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x40f8fac) rust-lang#9 0x00000000040ff4fb clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x40ff4fb) rust-lang#10 0x00000000041847b2 clang::CodeGen::CodeGenFunction::EmitFunctionBody(clang::Stmt const*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41847b2) rust-lang#11 0x0000000004199e4a clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4199e4a) rust-lang#12 0x00000000041f7b9d clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41f7b9d) rust-lang#13 0x00000000041f16a3 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41f16a3) rust-lang#14 0x00000000041fd954 clang::CodeGen::CodeGenModule::EmitDeferred() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41fd954) rust-lang#15 0x0000000004200277 clang::CodeGen::CodeGenModule::Release() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4200277) rust-lang#16 0x00000000046b6a49 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0 rust-lang#17 0x00000000046b4cb6 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x46b4cb6) rust-lang#18 0x0000000006204d5c clang::ParseAST(clang::Sema&, bool, bool) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x6204d5c) rust-lang#19 0x000000000496b278 clang::FrontendAction::Execute() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x496b278) rust-lang#20 0x00000000048dd074 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x48dd074) rust-lang#21 0x0000000004a38092 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4a38092) rust-lang#22 0x0000000000fd4e9c cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xfd4e9c) rust-lang#23 0x0000000000fcca73 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0 rust-lang#24 0x0000000000fd140c clang_main(int, char**, llvm::ToolContext const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xfd140c) rust-lang#25 0x0000000000ee2ef3 main (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xee2ef3) rust-lang#26 0x00007f8e4a10224c __libc_start_main (/lib64/libc.so.6+0x3524c) rust-lang#27 0x0000000000fcaae9 _start /home/abuild/rpmbuild/BUILD/glibc-2.31/csu/../sysdeps/x86_64/start.S:120:0 clang: error: unable to execute command: Aborted ``` --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Nov 19, 2024
…onger cause a crash (llvm#116569) This PR fixes a bug introduced by llvm#110199, which causes any half float argument to crash the compiler on MIPS64. Currently compiling this bit of code with `llc -mtriple=mips64`: ``` define void @half_args(half %a) nounwind { entry: ret void } ``` Crashes with the following log: ``` LLVM ERROR: unable to allocate function argument #0 PLEASE submit a bug report to /~https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. Program arguments: llc -mtriple=mips64 1. Running pass 'Function Pass Manager' on module '<stdin>'. 2. Running pass 'MIPS DAG->DAG Pattern Instruction Selection' on function '@half_args' #0 0x000055a3a4013df8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x32d0df8) #1 0x000055a3a401199e llvm::sys::RunSignalHandlers() (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x32ce99e) rust-lang#2 0x000055a3a40144a8 SignalHandler(int) Signals.cpp:0:0 rust-lang#3 0x00007f00bde558c0 __restore_rt libc_sigaction.c:0:0 rust-lang#4 0x00007f00bdea462c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 rust-lang#5 0x00007f00bde55822 gsignal ./signal/../sysdeps/posix/raise.c:27:6 rust-lang#6 0x00007f00bde3e4af abort ./stdlib/abort.c:81:7 rust-lang#7 0x000055a3a3f80e3c llvm::report_fatal_error(llvm::Twine const&, bool) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x323de3c) rust-lang#8 0x000055a3a2e20dfa (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x20dddfa) rust-lang#9 0x000055a3a2a34e20 llvm::MipsTargetLowering::LowerFormalArguments(llvm::SDValue, unsigned int, bool, llvm::SmallVectorImpl<llvm::ISD::InputArg> const&, llvm::SDLoc const&, llvm::SelectionDAG&, llvm::SmallVectorImpl<llvm::SDValue>&) const MipsISelLowering.cpp:0:0 rust-lang#10 0x000055a3a3d896a9 llvm::SelectionDAGISel::LowerArguments(llvm::Function const&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30466a9) rust-lang#11 0x000055a3a3e0b3ec llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c83ec) rust-lang#12 0x000055a3a3e09e21 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c6e21) rust-lang#13 0x000055a3a2aae1ca llvm::MipsDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) MipsISelDAGToDAG.cpp:0:0 rust-lang#14 0x000055a3a3e07706 llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c4706) rust-lang#15 0x000055a3a3051ed6 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x230eed6) rust-lang#16 0x000055a3a35a3ec9 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x2860ec9) rust-lang#17 0x000055a3a35ac3b2 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x28693b2) rust-lang#18 0x000055a3a35a499c llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x286199c) rust-lang#19 0x000055a3a262abbb main (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x18e7bbb) rust-lang#20 0x00007f00bde3fc4c __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3 rust-lang#21 0x00007f00bde3fd05 call_init ./csu/../csu/libc-start.c:128:20 rust-lang#22 0x00007f00bde3fd05 __libc_start_main@GLIBC_2.2.5 ./csu/../csu/libc-start.c:347:5 rust-lang#23 0x000055a3a2624921 _start /builddir/glibc-2.39/csu/../sysdeps/x86_64/start.S:117:0 ``` This is caused by the fact that after the change, `f16`s are no longer lowered as `f32`s in calls. Two possible fixes are available: - Update calling conventions to properly support passing `f16` as integers. - Update `useFPRegsForHalfType()` to return `true` so that `f16` are still kept in `f32` registers, as before llvm#110199. This PR implements the first solution to not introduce any more ABI changes as llvm#110199 already did. As of what is the correct ABI for halfs, I don't think there is a correct answer. GCC doesn't support halfs on MIPS, and I couldn't find any information on old MIPS ABI manuals either.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Nov 21, 2024
…#116656) The main issue to solve is that OpenMP modifiers can be specified in any order, so the parser cannot expect any specific modifier at a given position. To solve that, define modifier to be a union of all allowable specific modifiers for a given clause. Additionally, implement modifier descriptors: for each modifier the corresponding descriptor contains a set of properties of the modifier that allow a common set of semantic checks. Start with the syntactic properties defined in the spec: Required, Unique, Exclusive, Ultimate, and implement common checks to verify each of them. OpenMP modifier overhaul: rust-lang#2/3
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Nov 28, 2024
…ew API implementation (llvm#108413. llvm#117704) (llvm#117894) Relands llvm#117704, which relanded changes from llvm#108413 - this was reverted due to build issues. The new offload library did not build with `LIBOMPTARGET_OMPT_SUPPORT` enabled, which was not picked up by pre-merge testing. The last commit contains the fix; everything else is otherwise identical to the approved PR. ___ ### New API Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time. The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](/~https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL. ### Demoing the new API ```sh # From the runtime build directory $ ninja LibomptUnitTests $ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests ``` ### Open questions and future work * Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type. * It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Nov 28, 2024
…abort (llvm#117603) Hey guys, I found that Flang's built-in ABORT function is incomplete when I was using it. Compared with gfortran's ABORT (which can both abort and print out a backtrace), flang's ABORT implementation lacks the function of printing out a backtrace. This feature is essential for debugging and understanding the call stack at the failure point. To solve this problem, I completed the "// TODO:" of the abort function, and then implemented an additional built-in function BACKTRACE for flang. After a brief reading of the relevant source code, I used backtrace and backtrace_symbols in "execinfo.h" to quickly implement this. But since I used the above two functions directly, my implementation is slightly different from gfortran's implementation (in the output, the function call stack before main is additionally output, and the function line number is missing). In addition, since I used the above two functions, I did not need to add -g to embed debug information into the ELF file, but needed -rdynamic to ensure that the symbols are added to the dynamic symbol table (so that the function name will be printed out). Here is a comparison of the output between gfortran 's backtrace and my implementation: gfortran's implemention output: ``` #0 0x557eb71f4184 in testfun2_ at /home/hunter/plct/fortran/test.f90:5 #1 0x557eb71f4165 in testfun1_ at /home/hunter/plct/fortran/test.f90:13 rust-lang#2 0x557eb71f4192 in test_backtrace at /home/hunter/plct/fortran/test.f90:17 rust-lang#3 0x557eb71f41ce in main at /home/hunter/plct/fortran/test.f90:18 ``` my impelmention output: ``` Backtrace: #0 ./test(_FortranABacktrace+0x32) [0x574f07efcf92] #1 ./test(testfun2_+0x14) [0x574f07efc7b4] rust-lang#2 ./test(testfun1_+0xd) [0x574f07efc7cd] rust-lang#3 ./test(_QQmain+0x9) [0x574f07efc7e9] rust-lang#4 ./test(main+0x12) [0x574f07efc802] rust-lang#5 /usr/lib/libc.so.6(+0x25e08) [0x76954694fe08] rust-lang#6 /usr/lib/libc.so.6(__libc_start_main+0x8c) [0x76954694fecc] rust-lang#7 ./test(_start+0x25) [0x574f07efc6c5] ``` test program is: ``` function testfun2() result(err) implicit none integer :: err err = 1 call backtrace end function testfun2 subroutine testfun1() implicit none integer :: err integer :: testfun2 err = testfun2() end subroutine testfun1 program test_backtrace call testfun1() end program test_backtrace ``` I am well aware of the importance of line numbers, so I am now working on implementing line numbers (by parsing DWARF information) and supporting cross-platform (Windows) support.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Nov 28, 2024
…nitial new API implementation (llvm#108413. llvm#117704)" (llvm#117995) Reverts llvm#117894 Buildbot failures in OpenMP/Offload bots. https://lab.llvm.org/buildbot/#/builders/30/builds/11193
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Dec 4, 2024
…ne symbol size as symbols are created (llvm#117079)" This reverts commit ba668eb. Below test started failing again on x86_64 macOS CI. We're unsure if this patch is the exact cause, but since this patch has broken this test before, we speculatively revert it to see if it was indeed the root cause. ``` FAIL: lldb-shell :: Unwind/trap_frame_sym_ctx.test (1692 of 2162) ******************** TEST 'lldb-shell :: Unwind/trap_frame_sym_ctx.test' FAILED ******************** Exit Code: 1 Command Output (stderr): -- RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp + /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp clang: warning: argument unused during compilation: '-fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell' [-Wunused-command-line-argument] RUN: at line 8: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit | /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test + /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit + /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input ^ <stdin>:26:64: note: scanning from here frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp ^ <stdin>:27:2: note: possible intended match here frame rust-lang#2: 0x00007ff7bfeff6c0 ^ Input file: <stdin> Check file: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -dump-input=help explains the following input dump. Input was: <<<<<< . . . 21: 0x100003ed1 <+0>: pushq %rbp 22: 0x100003ed2 <+1>: movq %rsp, %rbp 23: (lldb) thread backtrace -u 24: * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 25: * frame #0: 0x0000000100003ecc trap_frame_sym_ctx.test.tmp`bar 26: frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp check:21'0 X error: no match found 27: frame rust-lang#2: 0x00007ff7bfeff6c0 check:21'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ check:21'1 ? possible intended match 28: frame rust-lang#3: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22 check:21'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 29: frame rust-lang#4: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22 check:21'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 30: frame rust-lang#5: 0x00007ff8193cc41f dyld`start + 1903 check:21'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 31: (lldb) exit check:21'0 ~~~~~~~~~~~~ >>>>>> ```
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Dec 6, 2024
The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZE VM SIZE -------------- -------------- +1.4% +653Ki +1.4% +653Ki .rodata +0.0% +960 +0.0% +960 .text +0.0% +197 +0.0% +197 .dynstr +0.0% +184 +0.0% +184 .eh_frame +0.0% +96 +0.0% +96 .dynsym +0.0% +40 +0.0% +40 .eh_frame_hdr +114% +32 [ = ] 0 [Unmapped] +0.0% +20 +0.0% +20 .gnu.hash +0.0% +8 +0.0% +8 .gnu.version +0.9% +7 +0.9% +7 [LOAD rust-lang#2 [R]] [ = ] 0 -75.4% -3.00Ki .relro_padding -16.1% -802Ki -16.1% -802Ki .data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi .rela.dyn -1.6% -2.66Mi -1.6% -2.66Mi TOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms [User: 4.1 ms, System: 13.3 ms] Range (min … max): 14.2 ms … 22.8 ms 162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms [User: 3.6 ms, System: 11.8 ms] Range (min … max): 12.4 ms … 20.3 ms 216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. ---- This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other globals. Fix from code review
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Dec 8, 2024
## Description This PR fixes a segmentation fault that occurs when passing options requiring arguments via `-Xopenmp-target=<triple>`. The issue was that the function `Driver::getOffloadArchs` did not properly parse the extracted option, but instead assumed it was valid, leading to a crash when incomplete arguments were provided. ## Backtrace ```sh llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o PLEASE submit a bug report to /~https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o 1. Compilation construction 2. Building compilation actions #0 0x0000562fb21c363b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (llvm-project/build/bin/clang+++0x392f63b) #1 0x0000562fb21c0e3c SignalHandler(int) Signals.cpp:0:0 rust-lang#2 0x00007fcbf6c81420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420) rust-lang#3 0x0000562fb1fa5d70 llvm::opt::Option::matches(llvm::opt::OptSpecifier) const (llvm-project/build/bin/clang+++0x3711d70) rust-lang#4 0x0000562fb2a78e7d clang::driver::Driver::getOffloadArchs(clang::driver::Compilation&, llvm::opt::DerivedArgList const&, clang::driver::Action::OffloadKind, clang::driver::ToolChain const*, bool) const (llvm-project/build/bin/clang+++0x41e4e7d) rust-lang#5 0x0000562fb2a7a9aa clang::driver::Driver::BuildOffloadingActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, std::pair<clang::driver::types::ID, llvm::opt::Arg const*> const&, clang::driver::Action*) const (.part.1164) Driver.cpp:0:0 rust-lang#6 0x0000562fb2a7c093 clang::driver::Driver::BuildActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, llvm::SmallVector<std::pair<clang::driver::types::ID, llvm::opt::Arg const*>, 16u> const&, llvm::SmallVector<clang::driver::Action*, 3u>&) const (llvm-project/build/bin/clang+++0x41e8093) rust-lang#7 0x0000562fb2a8395d clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) (llvm-project/build/bin/clang+++0x41ef95d) rust-lang#8 0x0000562faf92684c clang_main(int, char**, llvm::ToolContext const&) (llvm-project/build/bin/clang+++0x109284c) rust-lang#9 0x0000562faf826cc6 main (llvm-project/build/bin/clang+++0xf92cc6) rust-lang#10 0x00007fcbf6699083 __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:342:3 rust-lang#11 0x0000562faf923a5e _start (llvm-project/build/bin/clang+++0x108fa5e) [1] 2628042 segmentation fault (core dumped) main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -o ```
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Dec 8, 2024
llvm#118923) …d reentry. These utilities provide new, more generic and easier to use support for lazy compilation in ORC. LazyReexportsManager is an alternative to LazyCallThroughManager. It takes requests for lazy re-entry points in the form of an alias map: lazy-reexports = { ( <entry point symbol #1>, <implementation symbol #1> ), ( <entry point symbol rust-lang#2>, <implementation symbol rust-lang#2> ), ... ( <entry point symbol #n>, <implementation symbol #n> ) } LazyReexportsManager then: 1. binds the entry points to the implementation names in an internal table. 2. creates a JIT re-entry trampoline for each entry point. 3. creates a redirectable symbol for each of the entry point name and binds redirectable symbol to the corresponding reentry trampoline. When an entry point symbol is first called at runtime (which may be on any thread of the JIT'd program) it will re-enter the JIT via the trampoline and trigger a lookup for the implementation symbol stored in LazyReexportsManager's internal table. When the lookup completes the entry point symbol will be updated (via the RedirectableSymbolManager) to point at the implementation symbol, and execution will proceed to the implementation symbol. Actual construction of the re-entry trampolines and redirectable symbols is delegated to an EmitTrampolines functor and the RedirectableSymbolsManager respectively. JITLinkReentryTrampolines.h provides a JITLink-based implementation of the EmitTrampolines functor. (AArch64 only in this patch, but other architectures will be added in the near future). Register state save and reentry functionality is added to the ORC runtime in the __orc_rt_sysv_resolve and __orc_rt_resolve_implementation functions (the latter is generic, the former will need custom implementations for each ABI and architecture to be supported, however this should be much less effort than the existing OrcABISupport approach, since the ORC runtime allows this code to be written as native assembly). The resulting system: 1. Works equally well for in-process and out-of-process JIT'd code. 2. Requires less boilerplate to set up. Given an ObjectLinkingLayer and PlatformJD (JITDylib containing the ORC runtime), setup is just: ```c++ auto RSMgr = JITLinkRedirectableSymbolManager::Create(OLL); if (!RSMgr) return RSMgr.takeError(); auto LRMgr = createJITLinkLazyReexportsManager(OLL, **RSMgr, PlatformJD); if (!LRMgr) return LRMgr.takeError(); ``` after which lazy reexports can be introduced with: ```c++ JD.define(lazyReexports(LRMgr, <alias map>)); ``` LazyObectLinkingLayer is updated to use this new method, but the LLVM-IR level CompileOnDemandLayer will continue to use LazyCallThroughManager and OrcABISupport until the new system supports a wider range of architectures and ABIs. The llvm-jitlink utility's -lazy option now uses the new scheme. Since it depends on the ORC runtime, the lazy-link.ll testcase and associated helpers are moved to the ORC runtime.
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Dec 8, 2024
The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZE VM SIZE -------------- -------------- +1.4% +653Ki +1.4% +653Ki .rodata +0.0% +960 +0.0% +960 .text +0.0% +197 +0.0% +197 .dynstr +0.0% +184 +0.0% +184 .eh_frame +0.0% +96 +0.0% +96 .dynsym +0.0% +40 +0.0% +40 .eh_frame_hdr +114% +32 [ = ] 0 [Unmapped] +0.0% +20 +0.0% +20 .gnu.hash +0.0% +8 +0.0% +8 .gnu.version +0.9% +7 +0.9% +7 [LOAD rust-lang#2 [R]] [ = ] 0 -75.4% -3.00Ki .relro_padding -16.1% -802Ki -16.1% -802Ki .data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi .rela.dyn -1.6% -2.66Mi -1.6% -2.66Mi TOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms [User: 4.1 ms, System: 13.3 ms] Range (min … max): 14.2 ms … 22.8 ms 162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms [User: 3.6 ms, System: 11.8 ms] Range (min … max): 12.4 ms … 20.3 ms 216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. ---- This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other globals. Fix from code review switch back to clang-specific warning suppression review feedback format
nikic
pushed a commit
to nikic/llvm-project
that referenced
this pull request
Dec 9, 2024
The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZE VM SIZE -------------- -------------- +1.4% +653Ki +1.4% +653Ki .rodata +0.0% +960 +0.0% +960 .text +0.0% +197 +0.0% +197 .dynstr +0.0% +184 +0.0% +184 .eh_frame +0.0% +96 +0.0% +96 .dynsym +0.0% +40 +0.0% +40 .eh_frame_hdr +114% +32 [ = ] 0 [Unmapped] +0.0% +20 +0.0% +20 .gnu.hash +0.0% +8 +0.0% +8 .gnu.version +0.9% +7 +0.9% +7 [LOAD rust-lang#2 [R]] [ = ] 0 -75.4% -3.00Ki .relro_padding -16.1% -802Ki -16.1% -802Ki .data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi .rela.dyn -1.6% -2.66Mi -1.6% -2.66Mi TOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms [User: 4.1 ms, System: 13.3 ms] Range (min … max): 14.2 ms … 22.8 ms 162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms [User: 3.6 ms, System: 11.8 ms] Range (min … max): 12.4 ms … 20.3 ms 216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. ---- This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
(Rebase and duplicate of rust-lang/llvm#134)
This is an experimental patch set for LLVM to emit more accurate DWARF information for the WebAssembly: it extends DWARF expression language to express locals/globals locations (via
target-index operands atm).
The WebAssemblyExplicitLocals can replace virtual registers to target-index operand type at the time when WebAssembly backend introduces {set,tee}_local instead of corresponding virtual registers.
The subroutine frame base address can contain WebAssembly location (e.g. local)
See also rust-lang/llvm#134 (comment)