-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Experiment] Inline "is on heap?" checks for checked write barriers #111561
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
@EgorBot -azure_ampere -azure_cobalt100 -profiler using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
public class Benchmark
{
private MyStruct _heap1;
private MyLargeStruct _heap2;
[Benchmark]
public void ReturnBufferWB_Smallstruct_stack() => Smallstruct(default);
[Benchmark]
public void ReturnBufferWB_Smallstruct_heap() => _heap1 = Smallstruct(default);
[Benchmark]
public void ReturnBufferWB_Largestruct_stack() => Largestruct(default);
[Benchmark]
public void ReturnBufferWB_Largestruct_heap() => _heap2 = Largestruct(default);
[MethodImpl(MethodImplOptions.NoInlining)]
static MyStruct Smallstruct(MyStruct ms) => ms;
[MethodImpl(MethodImplOptions.NoInlining)]
static MyLargeStruct Largestruct(MyLargeStruct ms) => ms;
}
public struct MyStruct
{
public object Fld1;
public object Fld2;
public object Fld3;
}
public struct MyLargeStruct
{
public object Fld1;
public object Fld2;
public object Fld3;
public object Fld4;
public object Fld5;
public object Fld6;
public object Fld7;
public object Fld8;
} |
If GC is triggered right after |
Size diffs won't be nice, but preliminary arm64 perf results look very encouraging? When I brought up return barrier cost I had assumed keeping returns unmanaged would help but it's more than I expected. (and there's still a heap check in here!) Flipping the trade-off would help with the code size issue, assuming the copy mostly trades out with the removal of the write barrier call. Though I understand if such a broad change is just too involved or risky. This PR is nice in any case, as I have also seen things like try patterns show up in my profiling. Thanks for picking this up @EgorBo |
Yep, I had it in mind, this expansion happens late enough to be able to arrange nogc, but I wonder if it's possible for the GC boundaries to be just constant? Like GC reserves a huge memory region and the boundaries represent reserved, not comitted? thus, we can bake them as constant
It's important to keep in mind that struct copies for large blocks aren't cheap even without write barriers, so an extra struct copy might ruin the idea Overall, I am not planning to continue for now, I was mostly interested in perf numbers. I think we need something realistic to measure impact of it on. Also, @AndyAyersMS mentioned that at some point we might start emitting more (if not all) checked barriers due to IPO escape analsysis, so this might help there as well |
GC does that in some modes today. We would want to add information about this to WriteBarrierParameters. |
Yes larger structs may get hit by this, though I'm not sure how common those are, they're advised against either way. The perf numbers show most gains for the smaller structs, this is aligns with my own investigations. Just to reiterate, it's frustrating to have managed struct hot paths today, if not *everything* can be inlined perf reduces significantly. It's not what you expect defining something as simple as (unmanaged header, ROSeq buffer). Using a pooled class to avoid these return costs is not always possible, and can easily end up being more expensive due to the actual heap barriers you now incur to update the fields. |
Let's see if it works out
Current codegen:
New codegen: