Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow "artifact dependencies" on bin, cdylib, and staticlib crates #3028

Merged
merged 34 commits into from
Jan 23, 2021
Merged
Changes from 2 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
b921f08
Add RFC for dependencies on binaries
jyn514 Nov 30, 2020
fac809f
Clarify what `lib` means
jyn514 Dec 1, 2020
d45fb22
Add note about the alternative of a hardcoded binary path
joshtriplett Dec 8, 2020
0670af0
Add SPIR-V shaders as an example
joshtriplett Dec 13, 2020
9bc481d
Clarify that type and crate name transformations happen in both cases
joshtriplett Dec 13, 2020
1056622
Move long multi-sentence parenthetical to a sub-bullet
joshtriplett Dec 13, 2020
c6fde3c
Change the case for omitting the artifact suffix
joshtriplett Dec 13, 2020
b06d9cb
Discuss interaction with hypothetical automatic handling of staticlib…
joshtriplett Dec 13, 2020
42ae02f
Remove trailing space
joshtriplett Dec 13, 2020
8faaab8
Discuss alternative of omitting `"lib"`
joshtriplett Dec 14, 2020
6d95b60
Add note about handling of cross-compiled artifact dependencies
joshtriplett Dec 14, 2020
2e1aabf
We may want `env_os!` or `env_path!`
joshtriplett Dec 14, 2020
32f1a50
Remove unclear use case
jyn514 Dec 16, 2020
983638e
Elaborate on the use case for depending on a binary
joshtriplett Dec 16, 2020
c547a58
Add a use case for tools needed for testing
joshtriplett Dec 16, 2020
22c8a57
Mention Swift packages and "products"
joshtriplett Dec 16, 2020
9aebb5a
Remove a link from the use cases; the text explains the use case better
joshtriplett Dec 16, 2020
c46b068
Discuss alternative of recursive cargo or alternative build systems
joshtriplett Dec 16, 2020
442dcb0
Clarify a mention of "binary" to cover other artifact types
joshtriplett Dec 16, 2020
4e6f6ea
Provide an example for the artifact name matching the crate name
joshtriplett Dec 16, 2020
7f51fc4
Change `type` to `artifact` everywhere, and split out "lib"
joshtriplett Dec 16, 2020
58577cb
Use "artifact dependency" consistently, for clarity
joshtriplett Dec 17, 2020
ff2b254
Drop `bins` in favor of `artifact = "bin:name"`
joshtriplett Dec 17, 2020
82baedf
Elaborate on the LD_PRELOAD and VDSO use cases
joshtriplett Dec 17, 2020
0f3933b
Minor grammar tweak
joshtriplett Dec 17, 2020
6c84436
Clarify scope statement
joshtriplett Dec 17, 2020
eaf80fb
Add a cross-reference from drawbacks to alternatives
joshtriplett Dec 17, 2020
be85559
Expand an example
joshtriplett Dec 17, 2020
93c3094
Adjust a missed `<CRATE>` to `<DEP>` to match the rest
joshtriplett Dec 17, 2020
c9470ad
Document alternative syntaxes for specifying dependencies on specific…
joshtriplett Dec 17, 2020
a42fead
Propose specific artifact paths for the implementation
joshtriplett Dec 17, 2020
98e1476
Document the profile settings used to build artifact dependencies
joshtriplett Dec 17, 2020
2ddc224
Document what happens when building for multiple targets at once
joshtriplett Dec 24, 2020
278de5a
Merging RFC 3028
ehuss Jan 23, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
144 changes: 144 additions & 0 deletions text/0000-cargo-binary-dependencies.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,144 @@
- Feature Name: (`bindeps`)
- Start Date: 2020-11-30
- RFC PR: [rust-lang/rfcs#3028](/~https://github.com/rust-lang/rfcs/pull/3028)
- Rust Issue: [rust-lang/rust#0000](/~https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Allow Cargo packages to depend on `bin`, `cdylib`, and `staticlib` crates, and use the artifacts built by those crates.
joshtriplett marked this conversation as resolved.
Show resolved Hide resolved

# Motivation
[motivation]: #motivation

There are many different possible use cases.

- [Testing the behavior of a binary](/~https://github.com/rust-lang/cargo/issues/4316#issuecomment-361641639). Currently, this requires invoking `cargo build` recursively, or running `cargo build` before running `cargo test`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this RFC is necessary in this case for binaries defined in the local package. All integration test targets implicitly depend on all binary targets in a package today.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @Twey might have been talking about using binaries in unit tests. I'll let them describe their use case, I don't know anything other than the linked comment.

- [Running a binary that depends on another](/~https://github.com/rust-lang/rustc-perf/tree/master/collector#how-to-benchmark-a-change-on-your-own-machine). Currently, this requires running `cargo build`, making it difficult to keep track of when the binary was rebuilt. The use case for `rustc-perf` is to have a main binary that acts as an 'executor', which executes `rustc` many times, and a smaller 'shim' which wraps `rustc` with additional environment variables and arguments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this perhaps be expanded a bit? I'm not really sure what this is referring to, because cargo build builds all the binaries today in a package. Does rustc-perf have different packages with binaries that depend on each other?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the binaries depend on each other, so you can't use cargo run because that requires a specific binary. cargo build does work, but means you have to run the binary as target/release/collector, so it's easy to forget to recompile after a change.

- [Building tools needed at build time](/~https://github.com/rust-lang/rust/pull/79540#unresolved-questions). Currently, this requires either splitting the tool into a library crate (if written in Rust), or telling the user to install the tool on the host and detecting the availability of it. This feature would allow building the necessary tool from source and then invoking it from a `build.rs` script later in the build.
- Building and embedding binaries for another target, such as firmware or WebAssembly. This feature would allow a versioned dependency on an appropriate crate providing the firmware or WebAssembly binary, and then embedding the binary (or a compressed or otherwise transformed version of it) into the final crate. For instance, a virtual machine could build its system firmware, or a WebAssembly runtime could build helper libraries.
- Building and embedding a shared library for use at runtime. For instance, a binary could depend on a shared library used with [`LD_PRELOAD`](https://man7.org/linux/man-pages/man8/ld.so.8.html#ENVIRONMENT), or used in the style of the Linux kernel's [VDSO](https://man7.org/linux/man-pages/man7/vdso.7.html).
joshtriplett marked this conversation as resolved.
Show resolved Hide resolved

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Cargo allows you to depend on binary or C ABI artifacts of another package; this is known as a "binary dependency" or "artifact dependency". For example, you can depend on the `cmake` binary in your `build.rs` like so:

```toml
[build-dependencies]
cmake = { version = "1.0", type = "bin" }
```

Cargo will build the `cmake` binary, then make it available to your `build.rs` through an environment variable:

```rust
// build.rs
use std::{env, process::Command};

fn main() {
let cmake_path = env::var_os("CARGO_BIN_FILE_CMAKE_cmake").expect("cmake binary");
let mut cmake = Command::new(cmake_path).arg("--version");
assert!(cmake.status().expect("cmake --version failed").success());
}
```

You can optionally specify specific binaries to depend on using `bins`:

```toml
[build-dependencies]
cmake = { version = "1.0", type = "bin", bins = ["cmake"] }
```

If no binaries are specified, all the binaries in the package will be built and made available.

You can obtain the directory containing all binaries built by the `cmake` crate with `CARGO_BIN_DIR_CMAKE`, such as to add it to `$PATH` before invoking another build system or a script.

Cargo also allows depending on `cdylib` or `staticlib` artifacts. For example, you can embed a dynamic library in your binary:

```rust
// main.rs
const MY_PRELOAD_LIB: &[u8] = include_bytes!(env!("CARGO_CDYLIB_FILE_MYPRELOAD"));
```

Note that cargo cannot help you ensure these artifacts are available at runtime for an installed version of a binary; cargo can only supply these artifacts at build time. Runtime requirements for installed crates are out of scope for this change.
joshtriplett marked this conversation as resolved.
Show resolved Hide resolved

If you need to depend on multiple variants of a crate, such as both the binary and library of a crate, you can supply an array of strings for `type`: `type = ["bin", "lib"]`.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

There are four `type`s available:
1. `"lib"`, the default
joshtriplett marked this conversation as resolved.
Show resolved Hide resolved
2. `"bin"`, a crate building one or more binaries
3. `"cdylib"`, a C-compatible dynamic library
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about cdylib/staticlib here in the sense that these often have canonical meanings which Cargo would otherwise not be interpreting. For example if you depend on a lib Cargo will pass all the right --extern flags and make sure that it makes its way to the final binary. With a cdylib and/or staticlib you may want that as well.

For example you might want to consume a crate's C API and want to have that automatically linked into the final format. By only supporting cdylib and staticlib here and nothing else about these crate types I'd fear that we'd get into a situation where depending on one of these crate types involves a lot of boilerplate that Cargo could otherwise handle if we took some time to game out what it meant.

Is it possible to game out here what the main use cases are here and see if we can improve the Cargo experience?

Copy link
Member

@joshtriplett joshtriplett Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton This isn't intended for the use case where you just want to link your binary (or library) to the produced library. For that use case, I absolutely agree that cargo should provide more integration. This is intended for the use case where the library serves some other function, such as a runtime library for an interpreter (linked into interpreted programs rather than just the interpreter), or a VDSO for a kernel (linked into userspace programs run on that kernel), or a preloaded library for use with LD_PRELOAD, or a remote debugging stub injected into another program or device. In those cases, you really do just want the compiled artifact.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the problem lies exactly within type="lib" providing some kind of integration (adding --extern to rustc invocations) and type="cdylib" or type="staticlib" not doing so. Or vice versa. This difference is not easily inferred by just reading a toml file that contains both or one of the settings.

And yet the options change enough about the build process that anybody who attempts to understand the dependency structure needs to be very aware of the difference in behaviour here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nagisa I don't think it's reasonably possible to have "default" handling for cdylib, because linking a shared library doesn't guarantee it'll be available at runtime.

That said, if we believe there's some value in having non-artifact dependencies on cdylib or staticlib, with automatic handling, we could flag artifact dependencies in some way. For instance, type="artifact:cdylib". We could then reserve type="cdylib" for something else. I don't think that's necessary, but it's a reasonable possibility, and this feature would still be suitable for the intended purposes.

Copy link
Member

@nagisa nagisa Dec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option that comes to mind is to just remove type="lib" for now – making the default integrated beaviour only happen when the type is absent; or perhaps make it behave much like other types, in that it would only provide the rlib as an artifact, without any integrated behaviour.

In that case people who want both the "integrated" behaviour as well as an artifact, they could specify the dependency multiple times, once with a type and once without.

4. `"staticlib"`, a C-compatible static library

`"lib"` corresponds to all crates that can be depended on currently,
including `lib`, `rlib`, and `proc-macro` libraries.
See [linkage](https://doc.rust-lang.org/reference/linkage.html) for more information.

Artifact dependencies can appear in any of the three sections of dependencies (or in target-specific versions of these sections):
- `[build-dependencies]`
- `[dependencies]`
- `[dev-dependencies]`

By default, `build-dependencies` are built for the host, while `dependencies` and `dev-dependencies` are built for the target. You can specify the `target` attribute to build for a specific target, such as `target = "wasm32-wasi"`; a literal `target = "target"` will build for the target even if specifing a build dependency. (If the target is not available, this will result in an error at build time, just as if building the specified crate with a `--target` option for an unavailable target.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One worry I would have about supplying a target feature is that this sort of attribute is often better placed on the package itself. For example if you're depending on a wasm blob the package that compiles to wasm is unlikely to ever not want to get compiled to wasm, but this means that you'd have to specify the target every single time you depend on it.

I think that this sort of attribute on a dependency would work but it isn't clearly the best solution available to us I think. I think it would be worthwhile to explore the use cases of this and see if they'd be better suited with a dependency-level or package-level target declaration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting something like [package] default-target = "wasm32-wasi"? I'd be interested in that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One worry I would have about supplying a target feature is that this sort of attribute is often better placed on the package itself. For example if you're depending on a wasm blob the package that compiles to wasm is unlikely to ever not want to get compiled to wasm, but this means that you'd have to specify the target every single time you depend on it.

I disagree: for power-cpu-sim (linked above), it is specifically designed to be able to compile to a command-line program for linux/etc. and the exact same crate is also designed to be compiled for the web using target wasm32-wasi and there is an example web server which runs on linux and requires the produced wasm file to run.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I'm not saying that a dependency-level directive isn't useful. Nor does one reason why it is useful mean it's more useful than a package-level directive. My point is that the package-level directive is probably useful in more cases.

@jyn514 yes that's what I'm thinking (subject to bikeshedding of course)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton A package-level directive for the default also makes sense, but that would be a separate proposal. If a package truly supports only one target, a package-level directive would suffice. The target attribute on a dependency is intended for cases where the dependency supports multiple targets.

For example, consider a crate that builds virtual machine firmware, and supports multiple architectures. If you're building a virtual machine for a specific architecture, you need firmware for that architecture.

Or, consider a kernel that supports both 64-bit and 32-bit userspace applications, and needs to build different versions of a VDSO for each case.

I also expect target = "target" on build dependencies to be quite common: "no, I don't want to run this as part of the build process, I want this to be runnable on the target system instead".

I would also expect to see cases like this:

[build-dependencies]
thing32 = { package = "thing", version = "...", target = "i686-unknown-linux-gnu" }
thing64 = { package = "thing", version = "...", target = "x86_64-unknown-linux-gnu" }

joshtriplett marked this conversation as resolved.
Show resolved Hide resolved

Cargo provides the following environment variables to the crate being built:

- `CARGO_<TYPE>_DIR_<CRATE>`, where `<TYPE>` is the `type` of the artifact (uppercased) and `<CRATE>` is the package of the crate being depended on. (As with other Cargo environment variables, crate names are converted to uppercase, with dashes replaced by underscores.) This is the directory containing all the artifacts from the crate.
joshtriplett marked this conversation as resolved.
Show resolved Hide resolved
- `CARGO_<TYPE>_FILE_<CRATE>_<ARTIFACT>`, where `<TYPE>` is the `type` of the artifact, `<CRATE>` is the package of the crate being depended on, and `<ARTIFACT>` is the name of the artifact. (Note that `<ARTIFACT>` is *not* modified in any way from the `name` specified in the crate supplying the artifact, or the crate name if not specified; for instance, it may be in lowercase, or contain dashes.) This is the full path to the artifact.
- For the crate types `cdylib` and `staticlib` that can (currently) only build one artifact, cargo additionally supplies this variable with the `_<ARTIFACT>` suffix omitted.
joshtriplett marked this conversation as resolved.
Show resolved Hide resolved

For each kind of dependency, these variables are supplied to the same part of the build process that has access to that kind of dependency:
- For `build-dependencies`, these variables are supplied to the `build.rs` script, and can be accessed using `std::env::var_os`. (As with any OS file path, these may or may not be valid UTF-8.)
- For `dependencies`, these variables are supplied during the compilation of the crate, and can be accessed using `env!`.
- For `dev-dependencies`, these variables are supplied during the compilation of examples, tests, and benchmarks, and can be accessed using `env!`.

(See the "Future possibilities" section for a note about the use of `env!`.)

Similar to features, if other crates in your dependencies also depend on the same binary crate, and request different binaries, Cargo will build the union of all binaries requested.

Cargo will unify features and versions across dependencies of all types, just as it does for multiple dependencies on the same crate throughout a dependency tree.

`type` may be a string, or a list of strings; in the latter case, this specifies a dependency on the crate with each of those types, and is equivalent to specifying multiple dependencies with different `type` values. For instance, you may specify a build dependency on both the binary and library of the same crate. You may also specify separate dependencies of different `type`s; for instance, you may have a build dependency on the binary of a crate and a dependency on the library of the same crate.

Cargo does not take the specified `type` values into account when resolving a crate's version; it will resolve the version as normal, and then produce an error if that version does not support all the specified `type` values. Similarly, Cargo will produce an error if that version does not build all the binaries required by the `bins` value. Removing a crate type or binary is a semver-incompatible change. (Any further semver requirements on the interface provided by a binary or library depend on the nature of the binary or library in question.)

Until this feature is stabilized, it will require specifying the nightly-only option `-Z bindeps` to `cargo`. If `cargo` encounters a binary dependency or artifact dependency and does not have this option specified, it will emit an error and immediately stop building.

# Drawbacks
[drawbacks]: #drawbacks

Some of the motivating use cases have alternative solutions, such as extracting a library from a tool written in Rust, and making the tool a thin wrapper around the library. Making this change may potentially reduce the motivation to extract such libraries. However, many of the other use cases do not currently have any solutions, and extracted libraries have additional value even after this feature becomes available, so we don't see this as a reason to avoid introducing this feature.
joshtriplett marked this conversation as resolved.
Show resolved Hide resolved

Adding this feature will make Cargo usable for many more use cases, which may motivate people to use Cargo in more places and stretch it even further; this may, in turn, generate more support and more feature requests.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

This RFC proposes supplying both the root directory and the path to each specific artifact. The path to specific artifacts is useful for accessing that specific artifact, and avoids needing target-specific knowledge about the names of executables (`.exe`) or libraries (`lib*.so`, `*.dll`, ...). The root directory is useful for `$PATH`, `$LD_LIBRARY_PATH`, and similar. Going from one to the other requires making assumptions. We believe there's value in supplying both.

We could specify a `target = "host"` value to build for the host even for `[dependencies]` or `[dev-dependencies]` which would normally default to building for the target. If any use case arises for such a dependency, we can easily add that.

We could make information about artifact dependencies in `[dependencies]` available to the `build.rs` script, which would allow running arbitrary Rust code to work with such dependencies at build time (rather than being limited to `env!`, proc macros, and constant evaluation). However, we can achieve the same effect with an entry in `[build-dependencies]` that has `target = "target"`, and that model seems simpler to explain and to work with.

# Prior art
[prior-art]: #prior-art

- Cargo already provides something similar to this for C library dependencies of -sys crates. A `-sys` crate can supply arbitrary artifact paths, for libraries, headers, and similar. Crates depending on the `-sys` crate can obtain those paths via environment variables supplied via Cargo, such as to compile other libraries using the same C library. This proposal provides a similar feature for other types of crates and libraries.
- `make`, `cmake`, and many other build systems allow setting arbitrary goals as the dependencies of others. This allows building a binary and then running that binary in a rule that depends on that binary.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

How easily can Cargo handle a dependency with a different target specified? How will that interact with dependency resolution? Cargo already has to handle dependencies for both host and target (for cross-compilation), so those cases should already work.

# Future possibilities
[future-possibilities]: #future-possibilities

Currently, there's no mechanism to obtain an environment variable's value at compile time if that value is not valid UTF-8. In the future, we may want an `env_os!` macro, analogous to `std::env::var_os`, which returns a `&'static OsStr` rather than a `&'static str`. This is already an issue for existing environment variables supplied to the build that contain file paths.

In some cases, a crate may want to depend on a binary without unifying features or dependency versions with that binary. A future extension to this mechanism could allow cargo to build a binary crate in isolation, without attempting to do any unification.

Just as a `-sys` crate can supply additional artifacts other than the built binary, this mechanism could potentially expand in the future to allow building artifacts other than the built binary, such as C-compatible include files, various types of interface definition or protocol definition files, or arbitrary data files.