-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate excessive null-checks from slice iterators #21886
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
From the description I take it this doesn't suffer from the compile time problem mentioned in #21418 (comment)? (Out of interest what changed in LLVM? Just general optimisations?) Also, I found that the compiler aborted compiling libcore (can't remember if it was stage1 or stage2) when I did a similar change (#21448); this doesn't suffer from that? |
Right, I compared compile times for (IIRC) rustc_trans and rustc_typeck and there was no measureable difference in compile times with the assume in stage 2 (or maybe it was stage 1 that I tested, shouldn't matter). I also just bootstrapped a new version of #21418 and that took ~24 minutes from finishing stage0 libcore to finishing stage2 librustc. Considering that just rustc_trans took 13 minutes when the slowdown was there, it looks good to me ;-)
I don't know.
The benchmark was taken with the stage2 rustc, so no, it didn't abort. |
Could you add the assume to |
Why I originally had it in |
Oh, huh. I totally misread this PR. I was thinking this was in |
fb2e27b
to
a2dcc1e
Compare
I suspect that means that this doesn't solve a case like http://is.gd/pteFUM, .LBB0_2:
testq %rdx, %rdx
je .LBB0_4
addl (%rdx), %eax
addq $4, %rdx
addq $-4, %rcx
jne .LBB0_2 since there's no |
Will try to see if that helps |
@@ -427,8 +428,12 @@ impl<T> Vec<T> { | |||
#[stable(feature = "rust1", since = "1.0.0")] | |||
pub fn as_mut_slice<'a>(&'a mut self) -> &'a mut [T] { | |||
unsafe { | |||
let ptr = *self.ptr; | |||
if cfg!(not(stage0)) { // NOTE remove cfg after next snapshot | |||
assume(ptr != 0 as *mut T); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ptr::null_mut() here and all other uses of 0 as *mut T
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, using is_null()
works now, that didn't work the first time I tried, either because of an error on my side or because of the ptrtoint
instruction it was generated with the old is_null()
implementation.
Thanks!
a2dcc1e
to
e49f6d6
Compare
At least in this case, the |
@bors r+ e49f |
@bors: rollup |
@bors: rollup- |
⌛ Testing commit e49f6d6 with merge cd0d956... |
💔 Test failed - auto-win-32-nopt-t |
@huonw, that fold testcase seems to have been resolved now, it vectorizes! ( |
The failure seems legitimate; although, possibly a LLVM bug? |
Yeah, I can reproduce that in my Windows VM. |
Yup, LLVM bug: http://reviews.llvm.org/D7533 |
Fixes the crash blocking rust-lang#21886.
LLVM update landed. Needs rebase. |
Casting the pointer to an integer requires a ptrtoint, while casting 0 to a pointer is directly folded to a `null` value.
The data pointer used in the slice is never null, using assume() to tell LLVM about it gets rid of various unneeded null checks when iterating over the slice. Since the snapshot compiler is still using an older LLVM version, omit the call in stage0, because compile times explode otherwise. Benchmarks from rust-lang#18193 ```` running 5 tests test _range ... bench: 33329 ns/iter (+/- 417) test assembly ... bench: 33299 ns/iter (+/- 58) test enumerate ... bench: 33318 ns/iter (+/- 83) test iter ... bench: 33311 ns/iter (+/- 130) test position ... bench: 33300 ns/iter (+/- 47) test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured ```` Fixes rust-lang#18193
e49f6d6
to
7412d1b
Compare
⌛ Testing commit 7412d1b with merge c103604... |
💔 Test failed - auto-win-32-nopt-t |
@bors: retry |
The data pointer used in the slice is never null, using assume() to tell LLVM about it gets rid of various unneeded null checks when iterating over the slice. Since the snapshot compiler is still using an older LLVM version, omit the call in stage0, because compile times explode otherwise. Benchmarks from #18193 ```` running 5 tests test _range ... bench: 33329 ns/iter (+/- 417) test assembly ... bench: 33299 ns/iter (+/- 58) test enumerate ... bench: 33318 ns/iter (+/- 83) test iter ... bench: 33311 ns/iter (+/- 130) test position ... bench: 33300 ns/iter (+/- 47) test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured ```` Fixes #18193
💔 Test failed - auto-win-32-nopt-t |
@bors: retry |
The data pointer used in the slice is never null, using assume() to tell LLVM about it gets rid of various unneeded null checks when iterating over the slice. Since the snapshot compiler is still using an older LLVM version, omit the call in stage0, because compile times explode otherwise. Benchmarks from rust-lang#18193 ```` running 5 tests test _range ... bench: 33329 ns/iter (+/- 417) test assembly ... bench: 33299 ns/iter (+/- 58) test enumerate ... bench: 33318 ns/iter (+/- 83) test iter ... bench: 33311 ns/iter (+/- 130) test position ... bench: 33300 ns/iter (+/- 47) test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured ```` Fixes rust-lang#18193
This adds the assume() calls back that got lost when rebasing rust-lang#21886.
This adds the assume() calls back that got lost when rebasing #21886.
Could someone, please, point me to a tracking issue or a documentation about the excessive null-checks? I want to learn more about the cause of extra pub fn foo(s: &[u32]) -> u32 {
s.iter().sum()
} pub fn foo(s: &[u32]) -> u32 {
s.iter().sum()
} pub fn foo(s: Vec<i32>) -> usize {
s.len()
} and even pub fn foo(_s: Vec<i32>) -> usize {
0
} produces assembly code which might have been avoided: playground::foo:
movq 8(%rdi), %rsi
testq %rsi, %rsi
je .LBB0_2
pushq %rax
movq (%rdi), %rdi
shlq $2, %rsi
movl $4, %edx
callq __rust_dealloc@PLT
addq $8, %rsp
.LBB0_2:
xorl %eax, %eax
retq |
In both these cases the length, rather than pointer is being checked. For the |
Indeed, I completely forgot that the owned object must be deallocated here.
Right... @nagisa I am sorry for the trouble! Rust is awesome! :) |
The data pointer used in the slice is never null, using assume() to tell
LLVM about it gets rid of various unneeded null checks when iterating
over the slice.
Since the snapshot compiler is still using an older LLVM version, omit
the call in stage0, because compile times explode otherwise.
Benchmarks from #18193
Fixes #18193