Optimize unchecked indexing into chunks and chunks_mut #86823

Merged
merged 1 commit into from
Jul 8, 2021

Conversation

the8472
Member

@the8472 the8472 commented Jul 2, 2021

Fixes #53340

# BEFORE

$ rustc +nightly -Copt-level=3 -Ccodegen-units=1 -Clto=fat chunks.rs
$ perf stat ./chunks

 Performance counter stats for './chunks':

          3,177.03 msec task-clock                #    1.000 CPUs utilized
                 4      context-switches          #    0.001 K/sec
                 0      cpu-migrations            #    0.000 K/sec
           984,006      page-faults               #    0.310 M/sec
    13,092,199,322      cycles                    #    4.121 GHz                      (83.29%)
       384,543,475      stalled-cycles-frontend   #    2.94% frontend cycles idle     (83.35%)
     7,414,280,722      stalled-cycles-backend    #   56.63% backend cycles idle      (83.38%)
    50,493,980,662      instructions              #    3.86  insn per cycle
                                                  #    0.15  stalled cycles per insn  (83.29%)
     6,625,375,297      branches                  # 2085.396 M/sec                    (83.38%)
         3,087,652      branch-misses             #    0.05% of all branches          (83.31%)

       3.178079469 seconds time elapsed

       2.327156000 seconds user
       0.762041000 seconds sys

# AFTER

$ ./build/x86_64-unknown-linux-gnu/stage1/bin/rustc -Copt-level=3 -Ccodegen-units=1 -Clto=fat chunks.rs
$ perf stat ./chunks

 Performance counter stats for './chunks':

          2,705.76 msec task-clock                #    1.000 CPUs utilized
                 4      context-switches          #    0.001 K/sec
                 0      cpu-migrations            #    0.000 K/sec
           984,005      page-faults               #    0.364 M/sec
    11,156,763,039      cycles                    #    4.123 GHz                      (83.26%)
       342,198,882      stalled-cycles-frontend   #    3.07% frontend cycles idle     (83.37%)
     6,486,263,637      stalled-cycles-backend    #   58.14% backend cycles idle      (83.37%)
    40,553,476,617      instructions              #    3.63  insn per cycle
                                                  #    0.16  stalled cycles per insn  (83.37%)
     6,668,429,113      branches                  # 2464.532 M/sec                    (83.37%)
         3,099,636      branch-misses             #    0.05% of all branches          (83.26%)

       2.706725288 seconds time elapsed

       1.782083000 seconds user
       0.848424000 seconds sys
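
Reading the two runs side by side, the relative change per counter is easier to judge than the raw totals. The following is a small sketch, not part of the original PR, that computes the percentage reduction for a few counters copied from the output above; the helper name `percent_reduction` is made up for illustration. It works out to roughly 20% fewer instructions and about 15% fewer cycles and wall-clock seconds, for a single run on one machine.

```rust
/// Relative reduction from `before` to `after`, as a percentage of `before`.
/// (Hypothetical helper, introduced only for this illustration.)
fn percent_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

fn main() {
    // Counter values copied from the BEFORE and AFTER `perf stat` runs.
    let rows = [
        ("instructions", 50_493_980_662_f64, 40_553_476_617_f64),
        ("cycles", 13_092_199_322_f64, 11_156_763_039_f64),
        ("seconds elapsed", 3.178079469, 2.706725288),
    ];
    for (name, before, after) in rows.iter() {
        println!("{}: {:.1}% lower", name, percent_reduction(*before, *after));
    }
}
```

The branch count is essentially unchanged between the two runs, so the saving appears to come from fewer instructions per iteration rather than fewer iterations.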

@rust-highfive
Collaborator

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 2, 2021
@the8472
Member Author

the8472 commented Jul 2, 2021

Grepping through the compiler source doesn't show any uses of chunks or chunks_mut, let alone in combination with Zip, so I don't really expect a change in the perf.rust-lang.org results, but maybe some dependencies use them.

@bors try @rust-timer queue
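
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of `chunks` and `chunks_mut` used together through `zip`. It is not the `chunks.rs` benchmark from the PR description, which is not shown in this thread; the function name `add_chunked` and the chunk size are assumptions chosen for illustration. Judging by the PR title, this is presumably the kind of loop the change is meant to speed up.

```rust
/// Illustrative chunk size; any non-zero value works.
const CHUNK: usize = 4;

/// Adds each element of `src` into the matching element of `dst`,
/// walking both slices chunk by chunk via `zip`.
/// (Hypothetical example function, not taken from the PR.)
fn add_chunked(dst: &mut [u32], src: &[u32]) {
    for (d, s) in dst.chunks_mut(CHUNK).zip(src.chunks(CHUNK)) {
        // `d` is a mutable sub-slice, `s` the matching read-only sub-slice.
        for (x, y) in d.iter_mut().zip(s) {
            *x = x.wrapping_add(*y);
        }
    }
}

fn main() {
    let src: Vec<u32> = (0..10).collect();
    let mut dst = vec![1u32; 10];
    add_chunked(&mut dst, &src);
    println!("{:?}", dst);
}
```

Note that the last chunk may be shorter than `CHUNK` when the slice length is not a multiple of it, which the inner `zip` handles without extra code.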

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 2, 2021
@bors
Contributor

bors commented Jul 2, 2021

⌛ Trying commit 24094a0 with merge 5c392fe307a7b9c6ca1d328ad7dbed69fb03897d...

@klensy
Contributor

klensy commented Jul 2, 2021

Comparing a CI-built nightly with a manually built stage1 compiler can show unrelated differences.

@the8472
Member Author

the8472 commented Jul 2, 2021

I'd expect the compiler internals to be built differently, but shouldn't the output of the compiler remain comparable?

@bors
Contributor

bors commented Jul 2, 2021

☀️ Try build successful - checks-actions
Build commit: 5c392fe307a7b9c6ca1d328ad7dbed69fb03897d (5c392fe307a7b9c6ca1d328ad7dbed69fb03897d)

@rust-timer
Collaborator

Queued 5c392fe307a7b9c6ca1d328ad7dbed69fb03897d with parent 2545459, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking try commit (5c392fe307a7b9c6ca1d328ad7dbed69fb03897d): comparison url.

Summary: This benchmark run did not return any significant changes.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 3, 2021
@the8472 the8472 added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jul 4, 2021
@kennytm
Member

kennytm commented Jul 8, 2021

@bors r+

@bors
Contributor

bors commented Jul 8, 2021

📌 Commit 24094a0 has been approved by kennytm

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 8, 2021
@bors
Contributor

bors commented Jul 8, 2021

⌛ Testing commit 24094a0 with merge 0cd0709...

@bors
Contributor

bors commented Jul 8, 2021

☀️ Test successful - checks-actions
Approved by: kennytm
Pushing 0cd0709 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 8, 2021
@bors bors merged commit 0cd0709 into rust-lang:master Jul 8, 2021
@rustbot rustbot added this to the 1.55.0 milestone Jul 8, 2021
@pedrocr

pedrocr commented Jul 8, 2021

Awesome, thanks for working on this!

Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

Performance regression in tight loop since rust 1.25
8 participants