-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Missed optimization: references from pointers aren't treated as noalias #38941
Comments
It might be due to /issues/31681 |
Do you have a play link demonstrating noalias taking effect? AFAIK noalias annotations were completely removed after 1.7.0 due to the llvm bugs. Here is your foo with some bits and pieces to defeat whole-program optimisation:
If you compile that in release mode and look at the LLVM IR, you'll see that foo does not have noalias annotations and performs the store twice. |
Here's a demo: https://godbolt.org/g/pSj51i pub fn foo(a: &mut i32, b: &mut i32, x: &i32) {
*a = *x;
*b = *x;
} compiles into: example::foo:
push rbp
mov rbp, rsp
mov eax, dword ptr [rdx]
mov dword ptr [rdi], eax
mov dword ptr [rsi], eax
pop rbp
ret |
Right, sorry, I was being dim - you're referring to the load only happening once. This is probably because immutable references are marked as readonly in llvm (take a look at the debug llvm IR of my example). If you make x be Your last example has all pointer arguments which do not get annotated, even if they're |
I don't think the
It clearly states that the memory may be written to (but only through a different pointer, not through the argument marked as AFAIK, this optimization should only be possible if Indeed, if you compile the following C code: void foo(int *a, int *b, const int* x) {
*a = *x;
*b = *x;
} The generated LLVM IR marks
Actually, your point about making However, it doesn't answer the question why |
Good issue. noalias on &mut is unrelated, sorry for the misdirection. I'd look at
I'm not sure there is a place where noalias can be placed For this code: #![crate_type="lib"]
pub fn foo(a: &mut i32, b: &mut i32, x: *const i32) {
unsafe {
let r = &*x;
*a = *r;
*b = *r;
}
} This is produced ( ; ModuleID = 'noalias.cgu-0.rs'
source_filename = "noalias.cgu-0.rs"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
; Function Attrs: uwtable
define void @_ZN7noalias3foo17h1c56d659e6e1ab4bE(i32* dereferenceable(4), i32* dereferenceable(4), i32*) unnamed_addr #0 {
entry-block:
%_0 = alloca {}
br label %start
start: ; preds = %entry-block
%3 = load i32, i32* %2
store i32 %3, i32* %0
%4 = load i32, i32* %2
store i32 %4, i32* %1
ret void
}
attributes #0 = { uwtable } |
This may be what you're looking for - #16515 |
Perhaps someone can tag this issue as performance related? How can non-maintainers help to do such routine work? I vaguely remember @TimNN doing this a lot for issues here. |
Unfortunately, GitHub doesn't let anyone do issue triage without having a commit bit, it's kind of annoying. If you email me a list of issues and changes that should be made, I can manually do it on your behalf, for now. It's not great. I'd like to solve this somehow, but there's so much work to do... |
Off-topic |
So in which way is this issue not a duplicate of #16515? Cc @rust-lang/wg-codegen |
In the related Stack Overflow question by the OP, an answer was provided: pub unsafe fn f(a: *mut i32, b: *mut i32, x: *const i32) {
(|safe_a: &mut i32, safe_b: &mut i32, safe_x: &i32| {
*safe_a = *safe_x;
*safe_b = *safe_x;
})(&mut *a, &mut *b, &*x)
} This results in the LLVM IR: ; playground::f1
; Function Attrs: nounwind uwtable
define void @_ZN10playground2f117h829c91c07d14df5dE(i32* %a, i32* %b, i32* readonly %x) unnamed_addr #1 {
start:
%0 = icmp ne i32* %a, null
tail call void @llvm.assume(i1 %0)
%1 = icmp ne i32* %b, null
tail call void @llvm.assume(i1 %1)
%2 = icmp ne i32* %x, null
tail call void @llvm.assume(i1 %2)
%x.val = load i32, i32* %x, align 4
store i32 %x.val, i32* %a, align 4, !alias.scope !0, !noalias !3
store i32 %x.val, i32* %b, align 4, !alias.scope !3, !noalias !0
ret void
} Note that this contains The "normal" way of writing this: pub unsafe fn f2(a: *mut i32, b: *mut i32, x: *const i32) {
let safe_a: &mut i32 = &mut *a;
let safe_b: &mut i32 = &mut *b;
let safe_x: &i32 = &*x;
*safe_a = *safe_x;
*safe_b = *safe_x;
} Has no such metadata: define void @_ZN10playground2f217h6945a283c9ef76d3E(i32* nocapture %a, i32* nocapture %b, i32* nocapture readonly %x) unnamed_addr #0 {
start:
%0 = load i32, i32* %x, align 4
store i32 %0, i32* %a, align 4
%1 = load i32, i32* %x, align 4
store i32 %1, i32* %b, align 4
ret void
} It appears that noalias information only applies at function boundaries. |
Do we have an issue for |
@nagisa Oops, the way I read the start of the comment, I didn't think to check, my bad. |
#16515 is titled "make use of LLVM's scoped noalias metadata" We are making use of it, for function arguments. We are not making use of it in every case. If the issues are the same, perhaps one of y'all would be good enough to improve the title of that issue to make it more discoverable, and maybe update it to have a list of places that should use the metadata and don't. Otherwise, it reads a lot like a nebulous "make the compiler make faster code" issue. |
"metadata" is not function argument/return attributes, but rather the |
I am missing something, please help me understand what. There's literally
Zooming and enhancing:
|
@shepmaster LLVM generates that itself after inlining One more recent example is I wanted use to emit |
Nominated for compiler team discussion - how should we track these instances of #16515? |
So we very briefly discussed this at the T-compiler meeting. I'm not sure what path we want to use going forward to track work items like this. One possibility is having separate issues .. and then making #16515 a metabug tracking them? Another possibility is a separate github project. Not sure if that setup is helpful or harmful for a topic like this. @eddyb were there any particular thoughts/ideas you had? Or would you simply be disposed to close this and open a fresh tracking issue, as you mentioned above? |
The original example now produces, eliding the second read f:
mov eax, dword ptr [rdx]
mov dword ptr [rdi], eax
mov dword ptr [rsi], eax
ret as of Rust 1.78. Invoking with an aliasing |
We still don't emit any |
Now (with current nightly) pub unsafe fn f(a: *mut i32, b: *mut i32, x: *const i32) {
*a = *x;
*b = *x;
} compiled (-O) to f:
mov eax, dword ptr [rdx]
mov dword ptr [rdi], eax
mov dword ptr [rsi], eax
ret so pub fn g(a: *mut i32, b: *mut i32, x: *const i32) {
let safe_a = unsafe { &mut *a };
let safe_b = unsafe { &mut *b };
let safe_x = unsafe { &*x };
*safe_a = *safe_x;
*safe_b = *safe_x;
} produces the same assembly https://godbolt.org/z/oa557jqcj , so I suppose the issue can be closed? |
The reason So, what we need here is a better example; the issue almost certainly is still present. |
I have updated the example. But I am not sure if it's worth keeping the issue, this is basically a duplicate of #16515. |
The following code results in
a
being dereferenced again for the return value: (https://godbolt.org/z/YnrzMa3oj)That is to be expected. In contrast, the same function with references avoids the load: (https://godbolt.org/z/YnrzMa3oj)
However, if we change the code to the following: (https://godbolt.org/z/YnrzMa3oj)
then the extra dereference is not optimized out as it should be, because do not emit anything like
noalias
inside functions.The text was updated successfully, but these errors were encountered: