Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delay include_bytes to AST lowering #103812

Merged
merged 1 commit into from
Nov 12, 2022

Conversation

clubby789
Copy link
Contributor

Hopefully addresses #65818.
This PR introduces a new ExprKind::IncludedBytes which stores the path and bytes of a file included with include_bytes!(). We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.

@rustbot
Copy link
Collaborator

rustbot commented Oct 31, 2022

r? @TaKO8Ki

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 31, 2022
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rustbot
Copy link
Collaborator

rustbot commented Oct 31, 2022

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

Some changes occurred in src/tools/rustfmt

cc @rust-lang/rustfmt

@jyn514
Copy link
Member

jyn514 commented Oct 31, 2022

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 31, 2022
@bors
Copy link
Contributor

bors commented Oct 31, 2022

⌛ Trying commit 261c30cc4c962434dc5608ba767ed7b3cd7a35be with merge f3fc6066fb6e47915c9c0f888fbb7a66f620e14f...

@bors
Copy link
Contributor

bors commented Nov 1, 2022

☀️ Try build successful - checks-actions
Build commit: f3fc6066fb6e47915c9c0f888fbb7a66f620e14f (f3fc6066fb6e47915c9c0f888fbb7a66f620e14f)

@rust-timer
Copy link
Collaborator

Queued f3fc6066fb6e47915c9c0f888fbb7a66f620e14f with parent 95a3a72, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (f3fc6066fb6e47915c9c0f888fbb7a66f620e14f): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
4.2% [4.2%, 4.2%] 1
Improvements ✅
(primary)
-3.8% [-3.8%, -3.8%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -3.8% [-3.8%, -3.8%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.5% [3.5%, 3.5%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 1, 2022
@Kobzol
Copy link
Contributor

Kobzol commented Nov 1, 2022

This looks great! Could you please try to benchmark some code that includes a large binary file with and without this PR? We don't have a benchmark for this present in the rustc-perf suite at the moment.

@clubby789
Copy link
Contributor Author

With this program:

static BYTES: &[u8] = include_bytes!("out");

fn main() {
    println!("{}", BYTES.len());
}

Using GNU time to record the time and max RSS:

> dd if=/dev/urandom of=src/out bs=500M count=1
...
> cargo clean && /usr/bin/time -f 'mem %M, user %Us' cargo +nightly build
   Compiling big_inc v0.1.0 (/tmp/big_inc)
    Finished dev [unoptimized + debuginfo] target(s) in 10.52s
mem 4265648, user 9.55s
> cargo clean && /usr/bin/time -f 'mem %M, user %Us' cargo +stage1 build
   Compiling big_inc v0.1.0 (/tmp/big_inc)
    Finished dev [unoptimized + debuginfo] target(s) in 3.39s
mem 2755012, user 2.70s

@clubby789
Copy link
Contributor Author

clubby789 commented Nov 6, 2022

Looking at this a bit further, around 50% of the remaining compile time after this patch is spent in /~https://github.com/llvm/llvm-project/blob/244331ae833aaf33503bbd36890e704afb66a237/llvm/lib/IR/Constants.cpp#L2970-L2971 - not sure if that could be improved at all.

EDIT: To the best of my knowledge, this is due to LLVM hashing the entire constant byte by byte. This is also done on the Rust side at some points (hashing query results), but the impact is less severe.

@est31
Copy link
Member

est31 commented Nov 7, 2022

@clubby789 people have sent LLVM patches in the past to improve things for Rust (edit: in fact this occurs all the time). If you want to change llvm too and need help, you could maybe ask the llvm working group on zulip.

@Kobzol
Copy link
Contributor

Kobzol commented Nov 7, 2022

Hopefully the LLVM code could be improved (e.g. a faster hash function), but I wouldn't block this PR on this now, as this PR already seems to improve the situation for the Rust side a lot.

@petrochenkov
Copy link
Contributor

This is very similar to adding a new AST node for format_args!, see the discussion in https://rust-lang.zulipchat.com/#narrow/stream/233931-t-compiler.2Fmajor-changes/topic/Make.20format_args!.28.29.20its.20own.20AST.20node.20.28ast.E2.80.A6.20compiler-team.23541/near/295952151 and below.

What this node needs to have is a "canonical token representation", into which it turns one any attempt at tokenization, which includes pretty printing.
Thankfully, in this case the token representation is pretty simple.

Also, every use of ast::ExprKind::Lit needs to be audited and equivalent logic should be added for ast::ExprKind::IncludedBytes if relevant. It's hard to verify whether it was done or not from the diff.

@petrochenkov
Copy link
Contributor

r? @petrochenkov
@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 10, 2022
@clubby789
Copy link
Contributor Author

clubby789 commented Nov 10, 2022

Also, every use of ast::ExprKind::Lit needs to be audited and equivalent logic should be added for ast::ExprKind::IncludedBytes if relevant. It's hard to verify whether it was done or not from the diff.

The initial diff just changed some matches until it compiled - I've now gone through all references to ExprKind::Lit and made sure IncludedBytes is handled (like a ByteStr) where appropriate

@petrochenkov petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 10, 2022
compiler/rustc_ast/src/ast.rs Outdated Show resolved Hide resolved
compiler/rustc_ast/src/ast.rs Show resolved Hide resolved
compiler/rustc_ast/src/mut_visit.rs Show resolved Hide resolved
compiler/rustc_ast_lowering/src/expr.rs Outdated Show resolved Hide resolved
src/tools/rustfmt/src/expr.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/build.rs Outdated Show resolved Hide resolved
compiler/rustc_ast_pretty/src/pprust/state.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/proc_macro_server.rs Outdated Show resolved Hide resolved
@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 11, 2022
@petrochenkov
Copy link
Contributor

Thanks!
r=me after squashing commits and rebasing.
@rustbot author

@clubby789 clubby789 force-pushed the improve-include-bytes branch from e3d8936 to b2da155 Compare November 11, 2022 16:32
@petrochenkov
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Nov 11, 2022

📌 Commit b2da155 has been approved by petrochenkov

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 11, 2022
@bors
Copy link
Contributor

bors commented Nov 12, 2022

⌛ Testing commit b2da155 with merge 8ef2485...

@bors
Copy link
Contributor

bors commented Nov 12, 2022

☀️ Test successful - checks-actions
Approved by: petrochenkov
Pushing 8ef2485 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 12, 2022
@bors bors merged commit 8ef2485 into rust-lang:master Nov 12, 2022
@rustbot rustbot added this to the 1.67.0 milestone Nov 12, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (8ef2485): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.7% [3.7%, 3.7%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.4% [-2.4%, -2.4%] 1
All ❌✅ (primary) - - 0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.3% [2.3%, 2.3%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.3% [2.3%, 2.3%] 1

Aaron1011 pushed a commit to Aaron1011/rust that referenced this pull request Jan 6, 2023
…etrochenkov

Delay `include_bytes` to AST lowering

Hopefully addresses rust-lang#65818.
This PR introduces a new `ExprKind::IncludedBytes` which stores the path and bytes of a file included with `include_bytes!()`. We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.
@clubby789 clubby789 deleted the improve-include-bytes branch February 11, 2023 14:44
@Kobzol
Copy link
Contributor

Kobzol commented Jul 1, 2023

FWIW, I tried to replace djbHash with xxHash on the benchmark posted above, here are the results:

xxHash
Benchmark 1: cargo clean && cargo +local build
  Time (mean ± σ):      3.864 s ±  0.042 s    [User: 1.898 s, System: 2.269 s]
  Range (min … max):    3.749 s …  3.902 s    10 runs

djbHash
Benchmark 1: cargo clean && cargo +local build
  Time (mean ± σ):      4.307 s ±  0.025 s    [User: 2.244 s, System: 2.366 s]
  Range (min … max):    4.273 s …  4.351 s    10 runs

(I shouldn't also measure the clean step, but w/e, this is unscientific anyway :) ).

There is a clear improvement, but it's nothing really big (given the fact that we hash a stress test - 500 MiB file!). So I'm not sure if it's worth it to switch it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.