-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
code less optimized when accessing a specific array element, only on nightly #111508
Comments
searched nightlies: from nightly-2022-11-01 to nightly-2023-05-12 bisected with cargo-bisect-rustc v0.6.6Host triple: x86_64-unknown-linux-gnu cargo bisect-rustc 2022-11-01 --prompt --script ./script.sh It looks like the vec is copied through an alloca instead of just being handled through pointers. The cause is probably the layout changing from |
That's the intended layout. And I wouldn't expect pointer-in-the-middle to optimize better other than by accident. And the optimization failure depends on which array element gets accessed after the vec has been turned into an array. So it shouldn't even be related to the Vec struct layout anymore. So it looks more like spooky action at the distance going on in LLVM. It's also possible that the |
The index 24 seems to be special. Don't know why. There's also something very odd going on in the MIR for this: https://godbolt.org/z/f9Wa46cMr
The Cranking up the MIR inliner seems to improve things, but only if you don't also increase the inlining threshold for functions with |
I am guessing this is on an inflection point of a "if x is less than y, do this" and an "if x is greater than y, do this" heuristic, where one of the cases should be "greater or equal than" or "less than or equal". |
In nightly looking at 24 vs. 25, the only difference in the LLVM IR we generate is the GEP index, as you would expect: https://godbolt.org/z/zYcs9hhYc - %11 = getelementptr inbounds [32 x i8], ptr %a1, i64 0, i64 24
+ %11 = getelementptr inbounds [32 x i8], ptr %a1, i64 0, i64 25
%_8 = load i8, ptr %11, align 1, !noundef !2 |
It looks like it gets SROAd (via bit insert/extract) if the index is 25 but not if it is 24. Haven't looked into why. Something that would also fix this is to forward load from memcpy target to load from memcpy source in GVN, which seems like it would be a good idea to do anyway. |
This is not an issue. This is the |
Looks like this was fixed in 1.72: https://godbolt.org/z/W4j8qx8s7. I don't know why. |
Add codegen tests for E-needs-test close rust-lang#36010 close rust-lang#68667 close rust-lang#74938 close rust-lang#83585 close rust-lang#93036 close rust-lang#109328 close rust-lang#110797 close rust-lang#111508 close rust-lang#112509 close rust-lang#113757 close rust-lang#120440 close rust-lang#118392 close rust-lang#71096 r? nikic
Add codegen tests for E-needs-test close rust-lang#36010 close rust-lang#68667 close rust-lang#74938 close rust-lang#83585 close rust-lang#93036 close rust-lang#109328 close rust-lang#110797 close rust-lang#111508 close rust-lang#112509 close rust-lang#113757 close rust-lang#120440 close rust-lang#118392 close rust-lang#71096 r? nikic
Add codegen tests for E-needs-test close rust-lang#36010 close rust-lang#68667 close rust-lang#74938 close rust-lang#83585 close rust-lang#93036 close rust-lang#109328 close rust-lang#110797 close rust-lang#111508 close rust-lang#112509 close rust-lang#113757 close rust-lang#120440 close rust-lang#118392 close rust-lang#71096 r? nikic
Add codegen tests for E-needs-test close rust-lang#36010 close rust-lang#68667 close rust-lang#74938 close rust-lang#83585 close rust-lang#93036 close rust-lang#109328 close rust-lang#110797 close rust-lang#111508 close rust-lang#112509 close rust-lang#113757 close rust-lang#120440 close rust-lang#118392 close rust-lang#71096 r? nikic
#130726 "unfixes" this if it lands, unfortunately. |
I tried this code on rustc nightly (
rustc 1.71.0-nightly (c4190f2d3 2023-05-07)
) on godbolt and the assembly code contains some unnecessary checks, but not when changing the value ofN
to anything different than 24 (but not out of bounds fora
) or changing the version to 1.69.0 or betahttps://godbolt.org/z/WMWv5s57o
The text was updated successfully, but these errors were encountered: