Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

illumos should put libc last in library search order #84254

Merged
merged 1 commit into from
May 7, 2021

Conversation

jclulow
Copy link
Contributor

@jclulow jclulow commented Apr 16, 2021

Under some conditions, the toolchain will produce a sequence of linker
arguments that result in a NEEDED list that puts libc before libgcc_s;
e.g.,

[0]  NEEDED            0x2046ba            libc.so.1
[1]  NEEDED            0x204723            libm.so.2
[2]  NEEDED            0x204736            libsocket.so.1
[3]  NEEDED            0x20478b            libumem.so.1
[4]  NEEDED            0x204763            libgcc_s.so.1

Both libc and libgcc_s provide an unwinder implementation, but libgcc_s
provides some extra symbols upon which Rust directly depends. If libc
is first in the NEEDED list we will find some of those symbols in libc
but others in libgcc_s, resulting in undefined behaviour as the two
implementations do not use compatible interior data structures.

This solution is not perfect, but is the simplest way to produce correct
binaries on illumos for now.

@rust-highfive
Copy link
Collaborator

r? @matthewjasper

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 16, 2021
@jclulow
Copy link
Contributor Author

jclulow commented Apr 16, 2021

Some additional notes on the symbols in question. You can see that some symbols are implemented by both libc and libgcc_s, but there are a few newer additions to libgcc_s that are not presently in the base OS:

$ nm -p /usr/lib/amd64/libc.so.1 /usr/lib/amd64/libgcc_s.so.1 |
    awk '/:$/ { f = $0; sub(":", "", f); sub(".*/", "", f); }
        $NF ~ /^_Unwind/ { c[$NF] += 1; l[$NF] = l[$NF]" "f; }
        END { for (f in c) { printf("%d %s %s\n", c[f], f, l[f]); }}' |
    sort -b -k 1,1n -k 3,3 | column -t

1  _Unwind_ForcedUnwind_Body        libc.so.1
1  _Unwind_RaiseException_Body      libc.so.1
1  _Unwind_Backtrace                libgcc_s.so.1
1  _Unwind_DebugHook                libgcc_s.so.1
1  _Unwind_Find_FDE                 libgcc_s.so.1
1  _Unwind_FindEnclosingFunction    libgcc_s.so.1
1  _Unwind_ForcedUnwind_Phase2      libgcc_s.so.1
1  _Unwind_GetDataRelBase           libgcc_s.so.1
1  _Unwind_GetIPInfo                libgcc_s.so.1
1  _Unwind_GetTextRelBase           libgcc_s.so.1
1  _Unwind_IteratePhdrCallback      libgcc_s.so.1
1  _Unwind_RaiseException_Phase2    libgcc_s.so.1
1  _Unwind_Resume_or_Rethrow        libgcc_s.so.1
2  _Unwind_DeleteException          libc.so.1      libgcc_s.so.1
2  _Unwind_ForcedUnwind             libc.so.1      libgcc_s.so.1
2  _Unwind_GetCFA                   libc.so.1      libgcc_s.so.1
2  _Unwind_GetGR                    libc.so.1      libgcc_s.so.1
2  _Unwind_GetIP                    libc.so.1      libgcc_s.so.1
2  _Unwind_GetLanguageSpecificData  libc.so.1      libgcc_s.so.1
2  _Unwind_GetRegionStart           libc.so.1      libgcc_s.so.1
2  _Unwind_RaiseException           libc.so.1      libgcc_s.so.1
2  _Unwind_Resume                   libc.so.1      libgcc_s.so.1
2  _Unwind_SetGR                    libc.so.1      libgcc_s.so.1
2  _Unwind_SetIP                    libc.so.1      libgcc_s.so.1

Of the symbols that don't overlap, these are presently used within Rust:

$ nm -p /usr/lib/amd64/libc.so.1 /usr/lib/amd64/libgcc_s.so.1 | awk '/:$/ { f = $0; sub(":", "", f); sub(".*/", "", f); } $NF ~ /^_Unwind/ { c[$NF] += 1; l[$NF] = l[$NF]" "f; } END { for (f in c) { printf("%d %s %s\n", c[f], f, l[f]); }}' | sort -b -k 1,1n -k 3,3 | column -t | awk '$1 == 1 && $3 == "libgcc_s.so.1" { print $2 }' > /tmp/pats.txt

$ rg -Io -Ff /tmp/pats.txt /ws/rust/{library,compiler} | sort -u

_Unwind_Backtrace
_Unwind_FindEnclosingFunction
_Unwind_GetDataRelBase
_Unwind_GetIPInfo
_Unwind_GetTextRelBase

With this change, I have been able to implement backtrace-rs support for illumos systems (all examples now work and all tests now pass) via: jclulow/backtrace-rs@c9072be

@rust-log-analyzer

This comment has been minimized.

@jclulow jclulow force-pushed the illumos-link-order branch from 9a10b5f to 13bc380 Compare April 16, 2021 19:14
@petrochenkov
Copy link
Contributor

petrochenkov commented Apr 16, 2021

r? @petrochenkov
I don't think this is a good way to do this.
Libraries certainly should not be special-cased in linker.rs, and also should not be linked via target specs (unless there's absolutely no other way) because then they will be linked even in no-std mode.

The preferred way to link native libraries is via attributes in /~https://github.com/rust-lang/rust/blob/master/library/unwind/src/lib.rs if it's an unwinding library, attributes in /~https://github.com/rust-lang/libc/blob/master/src/unix/mod.rs if it's a part of libc, or attributes in /~https://github.com/rust-lang/rust/blob/master/library/std/src/sys/unix/mod.rs if it's just some system library used by libstd.
(Sometime they need to be linked multiple times from different places for correct linking order, this sometimes happens with variations of libgcc, for example.)
It would be good if you made illumos follow these guidelines.

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 16, 2021
@jclulow
Copy link
Contributor Author

jclulow commented Apr 16, 2021

r? @petrochenkov
I don't think this is a good way to do this.
Libraries certainly should not be special-cased in linker.rs, and also should not be linked via target specs (unless there's absolutely no other way) because then they will be linked even in no-std mode.

Fair enough. It's a challenge, because we have an ordering constraint that needs to hold here (as I described above) that isn't really, as far as I can tell, something that is correctly handled by all of the other machinery today.

As for nostd mode, on an illumos system even nostd binaries will link against libc as it's the only supported way to make system calls or access other UNIX facilities, so that part isn't an issue per se.

The preferred way to link native libraries is via attributes in /~https://github.com/rust-lang/rust/blob/master/library/unwind/src/lib.rs if it's an unwinding library, attributes in /~https://github.com/rust-lang/libc/blob/master/src/unix/mod.rs if it's a part of libc, or attributes in /~https://github.com/rust-lang/rust/blob/master/library/std/src/sys/unix/mod.rs if it's just some system library used by libstd.
(Sometime they need to be linked multiple times from different places for correct linking order, this sometimes happens with variations of libgcc, for example.)
It would be good if you made illumos follow these guidelines.

As far as I can tell, we are always correctly including both libc and libgcc_s today when they are required by use of the native functions that come from those libraries. The issue isn't that libc is missing -- rather, the problem is that sometimes, Rust passes arguments to the compiler driver which get passed to the linker that place libc before libgcc_s.

When this happens, we lazily resolve each symbol at runtime using the first entry in the NEEDED list that contains it. If libc is first, then we get some symbols from there, and the rest fall through to libgcc_s. We need to place libc last, always, in the list of native libraries to link against.

I think I have found another place to perform this surgery, though. Looking at link_local_crate_native_libs_and_dependent_crate_libs()...

/// Link native libraries corresponding to the current crate and all libraries corresponding to
/// all its dependency crates.
/// FIXME: Consider combining this with the functions above adding object files for the local crate.
fn link_local_crate_native_libs_and_dependent_crate_libs<'a, B: ArchiveBuilder<'a>>(
cmd: &mut dyn Linker,
sess: &'a Session,
crate_type: CrateType,
codegen_results: &CodegenResults,
tmpdir: &Path,
) {
// Take careful note of the ordering of the arguments we pass to the linker
// here. Linkers will assume that things on the left depend on things to the
// right. Things on the right cannot depend on things on the left. This is
// all formally implemented in terms of resolving symbols (libs on the right
// resolve unknown symbols of libs on the left, but not vice versa).
//
// For this reason, we have organized the arguments we pass to the linker as
// such:
//
// 1. The local object that LLVM just generated
// 2. Local native libraries
// 3. Upstream rust libraries
// 4. Upstream native libraries
//
// The rationale behind this ordering is that those items lower down in the
// list can't depend on items higher up in the list. For example nothing can
// depend on what we just generated (e.g., that'd be a circular dependency).
// Upstream rust libraries are not allowed to depend on our local native
// libraries as that would violate the structure of the DAG, in that
// scenario they are required to link to them as well in a shared fashion.
//
// Note that upstream rust libraries may contain native dependencies as
// well, but they also can't depend on what we just started to add to the
// link line. And finally upstream native libraries can't depend on anything
// in this DAG so far because they're only dylibs and dylibs can only depend
// on other dylibs (e.g., other native deps).
//
// If -Zlink-native-libraries=false is set, then the assumption is that an
// external build system already has the native dependencies defined, and it
// will provide them to the linker itself.
if sess.opts.debugging_opts.link_native_libraries {
add_local_native_libraries(cmd, sess, codegen_results);
}
add_upstream_rust_crates::<B>(cmd, sess, codegen_results, crate_type, tmpdir);
if sess.opts.debugging_opts.link_native_libraries {
add_upstream_native_libraries(cmd, sess, codegen_results, crate_type);
}
}

... I can see that we're going to call add_upstream_native_libraries() relatively late. I can probably adjust this loop to defer any call to cmd.link_dylib("c"), if one is to be made, until the end:

let crates = &codegen_results.crate_info.used_crates_static;
for &(cnum, _) in crates {
for lib in codegen_results.crate_info.native_libraries[&cnum].iter() {
let name = match lib.name {
Some(l) => l,
None => continue,
};
if !relevant_lib(sess, &lib) {
continue;
}
match lib.kind {
NativeLibKind::Dylib | NativeLibKind::Unspecified => cmd.link_dylib(name),
NativeLibKind::Framework => cmd.link_framework(name),
NativeLibKind::StaticNoBundle => {
// Link "static-nobundle" native libs only if the crate they originate from
// is being linked statically to the current crate. If it's linked dynamically
// or is an rlib already included via some other dylib crate, the symbols from
// native libs will have already been included in that dylib.
if data[cnum.as_usize() - 1] == Linkage::Static {
cmd.link_staticlib(name)
}
}
// ignore statically included native libraries here as we've
// already included them when we included the rust library
// previously
NativeLibKind::StaticBundle => {}
NativeLibKind::RawDylib => {
// FIXME(#58713): Proper handling for raw dylibs.
bug!("raw_dylib feature not yet implemented");
}
}
}
}

It's perhaps a bit unfortunate that additional user link flags, via add_user_defined_link_args(), can be injected after this processing though -- which is part of why I went with the late link args in the first place:

link_local_crate_native_libs_and_dependent_crate_libs::<B>(
cmd,
sess,
crate_type,
codegen_results,
tmpdir,
);
// OBJECT-FILES-NO, AUDIT-ORDER
if sess.opts.cg.profile_generate.enabled() || sess.instrument_coverage() {
cmd.pgo_gen();
}
// OBJECT-FILES-NO, AUDIT-ORDER
if sess.opts.cg.control_flow_guard != CFGuard::Disabled {
cmd.control_flow_guard();
}
// OBJECT-FILES-NO, AUDIT-ORDER
add_rpath_args(cmd, sess, codegen_results, out_filename);
// OBJECT-FILES-MAYBE, CUSTOMIZATION-POINT
add_user_defined_link_args(cmd, sess);
// NO-OPT-OUT, OBJECT-FILES-NO, AUDIT-ORDER
cmd.finalize();
// NO-OPT-OUT, OBJECT-FILES-MAYBE, CUSTOMIZATION-POINT
add_late_link_args(cmd, sess, flavor, crate_type, codegen_results);

Does that seem more appropriate? Can you think of a better way to enforce this ordering constraint in the linker arguments?

@nagisa
Copy link
Member

nagisa commented Apr 17, 2021

So, the order of the -l arguments the linker receives is deduced naturally from the dependency graph of the crates that contain the #[link(...)] attributes. Since in the ecosystem there's no relationship between the libc/libstd crates and the panic_unwind crate, there's naturally no defined order between the -l* arguments that these libraries imply.

One way to get this to happen could be to, on illumos targets, to have a strategically placed #[link(name="c", ...)] in the panic_unwind crate, so that the ordering constraint between the libgcc_s and libc libraries is deduced from the panic_unwind crate alone.

@jclulow
Copy link
Contributor Author

jclulow commented Apr 17, 2021

Thanks for looking @nagisa!

So, the order of the -l arguments the linker receives is deduced naturally from the dependency graph of the crates that contain the #[link(...)] attributes. Since in the ecosystem there's no relationship between the libc/libstd crates and the panic_unwind crate, there's naturally no defined order between the -l* arguments that these libraries imply.

One way to get this to happen could be to, on illumos targets, to have a strategically placed #[link(name="c", ...)] in the panic_unwind crate, so that the ordering constraint between the libgcc_s and libc libraries is deduced from the panic_unwind crate alone.

Are you saying we would remove this cargo:rustc-link-lib=gcc_s from the unwind build program:

} else if target.contains("illumos") {
println!("cargo:rustc-link-lib=gcc_s");

And instead, at the end of of /~https://github.com/rust-lang/rust/blob/b0c818c5e0fa37251d9fda2f656bf1041a2e1e1d/library/unwind/src/lib.rs we would add something like...

#[cfg(target_os = "illumos")]
#[link(name = "gcc_s")]
#[link(name = "c")]
extern "C" {}

Is my understanding correct?

@nagisa
Copy link
Member

nagisa commented Apr 17, 2021

You can try a similarly strategically placed invocation of

println!("cargo:rustc-link-lib=c");

in the build.rs too.

@petrochenkov
Copy link
Contributor

I forgot mentioning #73319 ("Linked native libraries are sometimes deduplicated and sometimes not deduplicated"), which may require using build scripts instead of (preferred) attributes here.

@jclulow

I think I have found another place to perform this surgery, though.

If the compiler requires special casing for such a regular task like linking ordering, then it is doing something wrong and doesn't provide some necessary user-visible functionality, and we need to add such functionality instead of special cases. (Which probably amounts to fixing #73319 in this specific case, but there's fortunately a workaround with build scripts as I said above.)

@jclulow
Copy link
Contributor Author

jclulow commented Apr 18, 2021

@nagisa Thanks for your help so far! I gave your suggestion a try, with the following patch in place of this PR:

diff --git a/library/unwind/build.rs b/library/unwind/build.rs
index d8bf152e4d6..3bb9682a615 100644
--- a/library/unwind/build.rs
+++ b/library/unwind/build.rs
@@ -33,7 +33,15 @@ fn main() {
     } else if target.contains("solaris") {
         println!("cargo:rustc-link-lib=gcc_s");
     } else if target.contains("illumos") {
+        // Both libgcc_s and libc contain an unwinder implementation, but on
+        // some systems the libc unwinder does not contain all of the extension
+        // symbols in libgcc_s that are used by Rust.  We include libc here
+        // explicitly in an attempt to ensure it will always appear after
+        // libgcc_s in the library search order in the output object, so that we
+        // do not end up using part of an unwinder implementation from two
+        // different libraries.
         println!("cargo:rustc-link-lib=gcc_s");
+        println!("cargo:rustc-link-lib=c");
     } else if target.contains("dragonfly") {
         println!("cargo:rustc-link-lib=gcc_pic");
     } else if target.contains("pc-windows-gnu") {

This did appear to improve the situation with the cargo binary built by the Rust build, in that libc was now after libgcc_s in the NEEDED list. It did not seem to help when I used the built toolchain to build and run the tests for a crate -- in this case, the backtrace crate. We ended up once more with the following list:

$ elfdump target/debug/deps/accuracy-b67fe8242b4d4db7  | grep NEEDED
       [0]  NEEDED            0x17c9b3            libc.so.1
       [1]  NEEDED            0x17ca1c            libm.so.2
       [2]  NEEDED            0x17ca26            libsocket.so.1
       [3]  NEEDED            0x17ca7b            libumem.so.1
       [4]  NEEDED            0x17ca53            libgcc_s.so.1

[ @petrochenkov ] If the compiler requires special casing for such a regular task like linking ordering, then it is doing something wrong and doesn't provide some necessary user-visible functionality, and we need to add such functionality instead of special cases.

The compiler itself doesn't require a special case, and isn't really doing anything wrong. In a traditional Makefile-driven build system where the consumer has control over what flags are passed to the compiler, one would always arrange to place -lc last in the library list, if it appears at all. More than likely, user software would omit it entirely because the C compiler knows to include it -- and, it must be said, libgcc_s and other C compiler runtime libraries if required -- in the appropriate spot.

The runtime linker will (in most simple cases like this) always allow implicit symbol interposition based on the order that libraries appear in the NEEDED list. On our platform in particular, the C library is the gateway to many base facilities like system calls and signal handling, but it also includes other base OS facilities like a minimal unwinder implementation and some stack protector functions. It will likely include other symbols in the future. Whomever (or in the case of Rust, whatever) is deciding what to pass to the linker needs to be aware of this.

I think that adding libc only via the late link arguments is still the simplest way to make sure the right thing happens, because there are two basic constraints here:

  • libc should effectively always appear in the NEEDED list, as a useful program cannot really be built without it
  • libc should appear last in the NEEDED list, so that no implicit interposition of symbols that are intended to come from other libraries (such as the unwinder in gcc_s) is allowed to occur, whether the duplicates exist in libc today or are introduced in the future

Because the Rust toolchain seeks to stand in for the combination of the erstwhile developer assembling manually a set of LD_FLAGS, it likely needs to be able to encode these sorts of constraints on behalf of the consumer.

Does that make sense?

@petrochenkov petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 18, 2021
@petrochenkov
Copy link
Contributor

petrochenkov commented Apr 18, 2021

This makes sense, but I still disagree with some points.

libc should effectively always appear in the NEEDED list, as a useful program cannot really be built without it

You can always use syscalls directly (at cost of less portability), and sometimes you have to, for example when writing some binary instrumentation code that shouldn't interfere with the application code (including libc) or writing a dynamic linker. Cases like this are exactly why #![no_std] in Rust or -nostdlib in C exist (especially on non-bare-metal targets), there should be an option for building without linking to anything.

Whomever (or in the case of Rust, whatever) is deciding what to pass to the linker needs to be aware of this.

In case of Rust it is libstd, libunwind (rust crate) and libc (rust crate), not the compiler.
If -lgcc_s should always go before -lc, then I suggest to do #[link(name = gcc_s)] #[link(name = c)] in the libc crate right now.
Once #73319 is fixed an the dedup modifier is implemented we may be able to do slightly better.

UPD: I'd be ok with the PR changing only the target spec (at least as a temporary step), it is the linker.rs part that makes it unacceptable to me.

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 18, 2021
@jclulow
Copy link
Contributor Author

jclulow commented Apr 18, 2021

libc should effectively always appear in the NEEDED list, as a useful program cannot really be built without it

You can always use syscalls directly (at cost of less portability), and sometimes you have to, for example when writing some binary instrumentation code that shouldn't interfere with the application code (including libc) or writing a dynamic linker. Cases like this are exactly why #![no_std] in Rust or -nostdlib in C exist (especially on non-bare-metal targets), there should be an option for building without linking to anything.

That may well be true for some other platforms, but isn't really true on illumos. There isn't really any reason I can think of that would make direct system calls an attractive solution. Indeed, necessary parts of the signal handling machinery (amongst other things) are part kernel and part libc. On top of this, if you did want to emit the assembly required for a system call on a particular version of the kernel, including libc doesn't stop you from doing that if you want.

If you want to instrument a process, we have many better tools for doing just that; e.g., process control through proc(4), or static or dynamic instrumentation through DTrace. You may be interested in our crate for adding USDT probe sites to a Rust program.

There is one case I can think of where libc must be omitted, and that's in kernel modules. For that, though, I believe we will create a kernel-specific target akin to the new x86_64-unknown-none-linuxkernel, as that 's a fundamentally different programming environment.

Whomever (or in the case of Rust, whatever) is deciding what to pass to the linker needs to be aware of this.

In case of Rust it is libstd, libunwind (rust crate) and libc (rust crate), not the compiler.
If -lgcc_s should always go before -lc, then I suggest to do #[link(name = gcc_s)] #[link(name = c)] in the libc crate right now.

I'm trying to make the point that, aside from this specific instance (libgcc_s before libc inhibiting unwinding) we have a general constraint: libc should just be last in the list always. If (when) we add additional symbols to libc in the future they may well conflict with symbols in other libraries; libc is dynamically linked so a binary that works one day might stop working quite the same way later. There's a simple, sure-fire way to produce binaries that are 100% correct in the face of future platform changes: place libc last in the link order.

Once #73319 is fixed an the dedup modifier is implemented we may be able to do slightly better.

UPD: I'd be ok with the PR changing only the target spec (at least as a temporary step), it is the linker.rs part that makes it unacceptable to me.

I don't believe that would have any effect, as the problem is not that libc is sometimes missing from the dependency list, it's that it needs to be last. If we only do the half of this change that adds it to the late link arguments, we would be adding libc a second time under many or most conditions, without preventing it from being inserted earlier in the linker arguments. Depending on how the dedup modifier is implemented, it may well have the same effect; e.g., removing all but the first (rather than last) instance of libc in the link order.

Rust may well grow, in future, a more complete system for allowing targets to express these sorts of platform constraints; e.g., by presenting the full abstract link order to the target code (before it becomes linker arguments) so that it may be post-processed, or by accepting a per-target list of partial ordering rules such as "libc only once, and always after every other library". If such a facility becomes available we would absolutely switch to using that and all of this change could come out straight away!

In the interim, though, this solution has the benefit of being explicit in enacting the constraint, easy to understand and to verify, and doesn't have a blast radius that affects any other platform.

@nagisa
Copy link
Member

nagisa commented Apr 19, 2021

The compiler, ideally, should have as little target specific logic outside the target definitions as possible. As this PR is implemented right now, a pretty core portion of the backend now has a illumos-specific workaround. We had similar work-arounds for other targets and it has been and is a source of hard to address issues. Libraries and users start relying on such workarounds and eventually prevent a proper fix. Most obvious in the case of this PR – crate authors might neglect a #[link(name="c") anywhere in their crate graph and come to rely on compiler always linking -lc for them. I think that's a compatibility hazard that should be avoided if possible.

I don't think it is useful here to discuss whether linking to libc at all times makes sense or not, however. While Rust does have a goal of "just" working out of the box, it also has a goal of enabling people to achieve unusual things. If they want to not link to libc – it should be possible to do so. The conventional way to do so today is to use #[no_std], and this PR would break that to an extent (though I guess on illumos we're still unconditionally linking ssp, regardless).

I can see that the requirement is to have NEEDED libc.so occur last, but It isn't clear to me how this translates to requirements for the linker invocation. The proposed solution deduplicates the -lc arguments to the linker and puts it last. Is that really the only way to achieve the end goal? For example, what happens if we do not deduplicate -lc and the linker command line ends up with multiple -lc, one of which is the last -l flag? Can you please specify the requirement in terms of what the invocation of the linker must look like?

@petrochenkov
Copy link
Contributor

Deduplication happens in favor of the last use (assuming it happens).

Also, we can make the linking order of libraries from unwind crate and libc crate predictable by strategically ordering their extern crate items in libstd (yes, it's relying on some implementation-defined behavior in rustc, but standard library can do that).

So -lc can be always placed last using library measures only.

@jclulow
Copy link
Contributor Author

jclulow commented Apr 19, 2021

Deduplication happens in favor of the last use (assuming it happens).

Also, we can make the linking order of libraries from unwind crate and libc crate predictable by strategically ordering their extern crate items in libstd (yes, it's relying on some implementation-defined behavior in rustc, but standard library can do that).

Do you mean by swapping the order of these two extern crate directives:

rust/library/std/src/lib.rs

Lines 354 to 361 in 1a6c98e

#[doc(masked)]
#[allow(unused_extern_crates)]
extern crate libc;
// We always need an unwinder currently for backtraces
#[doc(masked)]
#[allow(unused_extern_crates)]
extern crate unwind;

@petrochenkov
Copy link
Contributor

Yeah, those two.

@jclulow
Copy link
Contributor Author

jclulow commented Apr 19, 2021

I've built a toolchain with the following delta:

diff --git a/library/std/src/lib.rs b/library/std/src/lib.rs
index 90603cd9836..1f15af0c58b 100644
--- a/library/std/src/lib.rs
+++ b/library/std/src/lib.rs
@@ -348,15 +348,19 @@
 #[allow(unused_imports)] // macros from `alloc` are not used on all platforms
 #[macro_use]
 extern crate alloc as alloc_crate;
-#[doc(masked)]
-#[allow(unused_extern_crates)]
-extern crate libc;

 // We always need an unwinder currently for backtraces
 #[doc(masked)]
 #[allow(unused_extern_crates)]
 extern crate unwind;

+// On some platforms an unwinder is present in both libc and another library;
+// e.g., in libgcc_s on illumos.  Depend on libc after unwind in order to arrive
+// at the correct library search order in the output object; i.e., libc last.
+#[doc(masked)]
+#[allow(unused_extern_crates)]
+extern crate libc;
+
 // During testing, this crate is not actually the "real" std library, but rather
 // it links to the real std library, which was compiled from this same source
 // code. So any lang items std defines are conditionally excluded (or else they
diff --git a/library/unwind/build.rs b/library/unwind/build.rs
index d8bf152e4d6..3bb9682a615 100644
--- a/library/unwind/build.rs
+++ b/library/unwind/build.rs
@@ -33,7 +33,15 @@ fn main() {
     } else if target.contains("solaris") {
         println!("cargo:rustc-link-lib=gcc_s");
     } else if target.contains("illumos") {
+        // Both libgcc_s and libc contain an unwinder implementation, but on
+        // some systems the libc unwinder does not contain all of the extension
+        // symbols in libgcc_s that are used by Rust.  We include libc here
+        // explicitly in an attempt to ensure it will always appear after
+        // libgcc_s in the library search order in the output object, so that we
+        // do not end up using part of an unwinder implementation from two
+        // different libraries.
         println!("cargo:rustc-link-lib=gcc_s");
+        println!("cargo:rustc-link-lib=c");
     } else if target.contains("dragonfly") {
         println!("cargo:rustc-link-lib=gcc_pic");
     } else if target.contains("pc-windows-gnu") {

This appears to be sufficient to get the correct NEEDED list in all of the binaries shipped with the toolchain. It still isn't working for the backtrace crate tests, though. I have isolated one of the tests that doesn't work, and used our LD_ALTEXEC facility to redirect linker execution to a debugging script:

#!/bin/bash

(
        date
        pargs $$
        echo
        unset LD_ALTEXEC
        /usr/bin/ld -Dfiles,unused "$@"
        rc=$?
        echo
        echo exit: $rc
        echo
) >>/var/tmp/capld.args.txt 2>&1

exit $rc

I then did cargo clean and cargo test, and here are the arguments that are passed to ld to build target/debug/deps/accuracy-b67fe8242b4d4db7:

...
argv[94]: -Bdynamic
argv[95]: -lc
argv[96]: -lm
argv[97]: -lrt
argv[98]: -lpthread
argv[99]: -lsocket
argv[100]: -lposix4
argv[101]: -lpthread
argv[102]: -lresolv
argv[103]: -lnsl
argv[104]: -lumem
argv[105]: -lgcc_s
argv[106]: -lc
argv[107]: -lc
argv[108]: -lm
argv[109]: -lrt
argv[110]: -lpthread
argv[111]: -lssp
argv[112]: /opt/gcc-10/lib/gcc/x86_64-pc-solaris2.11/10.2.0/crtend.o
argv[113]: /usr/lib/amd64/crtn.o

From the debugging output of ld we can see that we use the library where it first appears in the command line, and ignore subsequent entries that map to the same object. We can also see that specified libraries that don't end up being referenced are omitted (like ssp):

$ grep 'debug:.*\.so' /var/tmp/capld.args.txt
debug: file=/lib/amd64/libc.so  [ ET_DYN ]
debug: file=/lib/amd64/libm.so  [ ET_DYN ]
debug: file=/lib/amd64/librt.so  [ ET_DYN ]
debug: file=/lib/amd64/libpthread.so  [ ET_DYN ]
debug: file=/lib/amd64/libsocket.so  [ ET_DYN ]
debug: file=/lib/amd64/libposix4.so;  skipped: already processed as /lib/amd64/librt.so
debug: file=/lib/amd64/libpthread.so;  skipped: already processed
debug: file=/lib/amd64/libresolv.so  [ ET_DYN ]
debug: file=/lib/amd64/libnsl.so  [ ET_DYN ]
debug: file=/lib/amd64/libumem.so  [ ET_DYN ]
debug: file=/usr/gcc/10/lib/amd64/libgcc_s.so  [ ET_DYN ]
debug: file=/lib/amd64/libc.so;  skipped: already processed
debug: file=/lib/amd64/libc.so;  skipped: already processed
debug: file=/lib/amd64/libm.so;  skipped: already processed
debug: file=/lib/amd64/librt.so;  skipped: already processed
debug: file=/lib/amd64/libpthread.so;  skipped: already processed
debug: file=/usr/gcc/10/lib/amd64/libssp.so  [ ET_DYN ]
debug: file=/lib/amd64/libmd.so.1  [ ET_DYN ]
debug: file=/lib/amd64/libmp.so.2  [ ET_DYN ]
debug: file=librt.so.1  unused: does not satisfy any references
debug: file=libpthread.so.1  unused: does not satisfy any references
debug: file=libresolv.so.2  unused: does not satisfy any references
debug: file=libnsl.so.1  unused: does not satisfy any references
debug: file=libssp.so.0  unused: does not satisfy any references

The full output is very long so I have placed it in a gist: https://gist.github.com/jclulow/462c838eb717acae7bc87db6165359ec

The final NEEDED list from the above linker invocation is:

 $ elfdump -d target/debug/deps/accuracy-b67fe8242b4d4db7 | grep NEEDED
       [0]  NEEDED            0x17c9b0            libc.so.1
       [1]  NEEDED            0x17ca19            libm.so.2
       [2]  NEEDED            0x17ca23            libsocket.so.1
       [3]  NEEDED            0x17ca78            libumem.so.1
       [4]  NEEDED            0x17ca50            libgcc_s.so.1

I'm a bit of a loss as to what do next to fix this, and would appreciate some guidance. Thanks!

@petrochenkov petrochenkov removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Apr 20, 2021
@jclulow
Copy link
Contributor Author

jclulow commented Apr 26, 2021

Yes, the libc change needs to be released so everyone including libstd and libbacktrace starts using.
Yes, looks correct, any crate in the tree depending on libc crate will bring an additional -lc that won't be deduplicated by the current compiler, that's why we need to prefix every such -lc with -lgcc_s.

I would like to avoid a workaround that would require us to go and revisit the many crates that we've already put platform support work into. PRs can take weeks or months to get merged sometimes, and there's nothing stopping any crate in the ecosystem from including a dependency on the native libc to get access to some function that now needs to be modified to avoid causing problems. Plus, once the Rust toolchain is eventually fixed, we'd have to go around and undo the workaround everywhere.

Can we please take another look at the original patch I proposed, with the full understanding that when whatever larger project to improve linker interaction and library deduplication is completed, this can be a temporary fix that we can remove from the code?

@petrochenkov
Copy link
Contributor

The root problem is not even linking ordering, but that two different libraries in the dependency tree have public symbols that have same names but are not incompatible.

If you have any two such conflicting libraries a and b, then it will ruin any automated dependency-based build system be it cargo with #[link] attributes or cmake with target_link_libraries, and any such library pair will need a custom treatment (which I don't want to put into the compiler).

@jclulow
Copy link
Contributor Author

jclulow commented Apr 26, 2021

The root problem is not even linking ordering, but that two different libraries in the dependency tree have public symbols that have same names but are not incompatible.

If you have any two such conflicting libraries a and b, then it will ruin any automated dependency-based build system be it cargo with #[link] attributes or cmake with target_link_libraries, and any such library pair will need a custom treatment (which I don't want to put into the compiler).

This isn't really true, though, because both libc and libgcc_s are both somewhat special in the vast majority of (C-centric) software and build systems. In particular, I believe it is relatively uncommon for software to pass -nodefaultlibs to GCC, as Rust is doing here, and thus GCC can itself take care of which combination of libgcc_s and libc to include, and in which order (which is, generally, just done at the end of the linker library list). Cmake-based build systems generally don't have to figure out where to pass -lc, it lets the C compiler handle that.

@petrochenkov
Copy link
Contributor

@jclulow

I would like to avoid a workaround that would require us to go and revisit the many crates that we've already put platform support work into.

Nothing like this is required, the only thing we need to update by publishing a new patch version is the libc crate.
All other crates will then pull the updated version of libc automatically (unless they have Cargo.lock committed).

@jclulow
Copy link
Contributor Author

jclulow commented Apr 26, 2021

OK, what about:

I can add a new field to struct TargetOptions, something like ignore_libraries: Option<Vec<String>>. If this is present in the target options, and the name of the library that's being added (from any crate, external or builtin) is in the list, it would be silently ignored rather than passed to the linker. Then, the specifics of the library name can be entirely contained in the spec for illumos; we would provide ignore_libraries of ["c"], and add -lc in explicitly to the late link args as we've been allowed to do with -lssp and which has been working well. Because we pass -z ignore to the linker, in the unlikely event that no symbols from libc are used it will be dropped from the NEEDED list anyway, so nostd binaries that somehow don't depend on making system calls will continue to be possible.

In this way, we can get what we need, and the link.rs and linker.rs files would remain free of specific library names. Platform-specific constants would remain in the spec area. Would that be OK?

@petrochenkov petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 26, 2021
@petrochenkov
Copy link
Contributor

petrochenkov commented May 4, 2021

Status update: #83507 will land soon and for #38460/#73319 we'll likely end up with an aggressive deduplication strategy by default and an opt-out with a modifier.

So, yeah, let's land some temporary hack that can be easily reverted later without touching other repos like libc.
This PR as is would work, but could you add some big FIXMEs to the added code telling that it's supposed to be replaced by some more generic mechanism.

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 4, 2021
@bors

This comment has been minimized.

Under some conditions, the toolchain will produce a sequence of linker
arguments that result in a NEEDED list that puts libc before libgcc_s;
e.g.,

    [0]  NEEDED            0x2046ba            libc.so.1
    [1]  NEEDED            0x204723            libm.so.2
    [2]  NEEDED            0x204736            libsocket.so.1
    [3]  NEEDED            0x20478b            libumem.so.1
    [4]  NEEDED            0x204763            libgcc_s.so.1

Both libc and libgcc_s provide an unwinder implementation, but libgcc_s
provides some extra symbols upon which Rust directly depends.  If libc
is first in the NEEDED list we will find some of those symbols in libc
but others in libgcc_s, resulting in undefined behaviour as the two
implementations do not use compatible interior data structures.

This solution is not perfect, but is the simplest way to produce correct
binaries on illumos for now.
@jclulow jclulow force-pushed the illumos-link-order branch from 13bc380 to 31c2ad0 Compare May 7, 2021 00:09
@petrochenkov
Copy link
Contributor

@bors r+ rollup

@bors
Copy link
Contributor

bors commented May 7, 2021

📌 Commit 31c2ad0 has been approved by petrochenkov

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 7, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request May 7, 2021
…chenkov

illumos should put libc last in library search order

Under some conditions, the toolchain will produce a sequence of linker
arguments that result in a NEEDED list that puts libc before libgcc_s;
e.g.,

    [0]  NEEDED            0x2046ba            libc.so.1
    [1]  NEEDED            0x204723            libm.so.2
    [2]  NEEDED            0x204736            libsocket.so.1
    [3]  NEEDED            0x20478b            libumem.so.1
    [4]  NEEDED            0x204763            libgcc_s.so.1

Both libc and libgcc_s provide an unwinder implementation, but libgcc_s
provides some extra symbols upon which Rust directly depends.  If libc
is first in the NEEDED list we will find some of those symbols in libc
but others in libgcc_s, resulting in undefined behaviour as the two
implementations do not use compatible interior data structures.

This solution is not perfect, but is the simplest way to produce correct
binaries on illumos for now.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request May 7, 2021
…chenkov

illumos should put libc last in library search order

Under some conditions, the toolchain will produce a sequence of linker
arguments that result in a NEEDED list that puts libc before libgcc_s;
e.g.,

    [0]  NEEDED            0x2046ba            libc.so.1
    [1]  NEEDED            0x204723            libm.so.2
    [2]  NEEDED            0x204736            libsocket.so.1
    [3]  NEEDED            0x20478b            libumem.so.1
    [4]  NEEDED            0x204763            libgcc_s.so.1

Both libc and libgcc_s provide an unwinder implementation, but libgcc_s
provides some extra symbols upon which Rust directly depends.  If libc
is first in the NEEDED list we will find some of those symbols in libc
but others in libgcc_s, resulting in undefined behaviour as the two
implementations do not use compatible interior data structures.

This solution is not perfect, but is the simplest way to produce correct
binaries on illumos for now.
bors added a commit to rust-lang-ci/rust that referenced this pull request May 7, 2021
Rollup of 9 pull requests

Successful merges:

 - rust-lang#84254 (illumos should put libc last in library search order)
 - rust-lang#84442 (Unify rustc and rustdoc parsing of `cfg()`)
 - rust-lang#84655 (Cleanup of `wasm`)
 - rust-lang#84866 (linker: Avoid library duplication with `/WHOLEARCHIVE`)
 - rust-lang#84930 (rename LLVM target for RustyHermit)
 - rust-lang#84991 (rustc: Support Rust-specific features in -Ctarget-feature)
 - rust-lang#85029 (SGX mutex is movable)
 - rust-lang#85030 (Rearrange SGX split module files)
 - rust-lang#85033 (some further small cleanups)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit cd37b95 into rust-lang:master May 7, 2021
@rustbot rustbot added this to the 1.54.0 milestone May 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants