Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: chained guarded devirtualization #51890

Merged
merged 2 commits into from
Apr 28, 2021

Conversation

AndyAyersMS
Copy link
Member

When expanding a guarded devirtualization call site, scout ahead to see if
there's another site just after this one. If so, expand the second site
so that the frequently-taken paths from the first and second tests form a
join-free "hot path". Continue scouting and chaining candidates along this
path until we run out of candidates, find a big enough stretch without a
candidate, or find a candidate with low likelihood of success.

Chaining enhances the abilities of the redundant branch optimizer to clean
up redundant type tests, so if there are multiple virtual or interface
calls in a short span, all dispatching off the same object, one test will
suffice to cover all the calls.

When expanding a guarded devirtualization call site, scout ahead to see if
there's another site just after this one. If so, expand the second site
so that the frequently-taken paths from the first and second tests form a
join-free "hot path". Continue scouting and chaining candidates along this
path until we run out of candidates, find a big enough stretch without a
candidate, or find a candidate with low likelihood of success.

Chaining enhances the abilities of the redundant branch optimizer to clean
up redundant type tests, so if there are multiple virtual or interface
calls in a short span, all dispatching off the same object, one test will
suffice to cover all the calls.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 26, 2021
@AndyAyersMS
Copy link
Member Author

cc @dotnet/jit-contrib

This might be easier to explain via pictures. For the current GDV we expand as follows (here S0 and S1 are arbitrary sequences of statements):

Before, when there were two GDV candidates in a block; we'd just repeat this -- see below. Here in the after picture the blue arcs show the direction of most likely flow. Note how the "hot path" of blue intersects with the cold path in the middle. This control flow join makes it tricky for optimization facts from the first devirtualized call to flow down to the second, and also makes it tricky to optimize away the the second type test, if it happens to be redundant with the first.

With this PR, we expand the second GDV candidate differently, if it seems profitable to do so. Starting with the default expansion above, we first duplicate S1 into the two predecessor blocks (left panel below), and then we have the cold path bypass the second type test (right panel below).

Here's a example from the diffs where we create a long chain (left is "current" expansion, right is new expansion; here blue is the fall through path, not the hot path, and the numbers in the blocks are the profile counts). You can see that now all the profiled blocks end up in a long join-free sequence.

@AndyAyersMS
Copy link
Member Author

SPMI diff summary -- diffs in asp.net only. Despite this being a code-expanding transformation (since we're duplicating S1) it usually reduces method size.

Current heuristic requires a likelihood >= 75 to start or continue a chain, and no more than 2 interposing statements. I may look at increasing the interposing count as there are some opportunities out there if we're willing to copy 3 or more statements

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 182307
Total bytes of diff: 180773
Total bytes of delta: -1534 (-0.84% of base)
    diff is an improvement.

191 total methods with Code Size differences (142 improved, 49 regressed), 5 unchanged.
Top file regressions (bytes):
          29 : 26466.dasm (1.84% of base)
          29 : 38902.dasm (1.84% of base)
          17 : 26356.dasm (1.93% of base)
          17 : 39667.dasm (1.93% of base)
          15 : 42356.dasm (0.23% of base)
          13 : 18848.dasm (3.98% of base)
          13 : 32161.dasm (3.98% of base)
          11 : 18802.dasm (4.53% of base)
          11 : 22235.dasm (4.53% of base)
          11 : 38884.dasm (4.58% of base)
          10 : 39968.dasm (4.93% of base)
          10 : 26451.dasm (0.40% of base)
          10 : 26980.dasm (4.93% of base)
           5 : 44477.dasm (0.64% of base)
           5 : 19545.dasm (0.64% of base)
           5 : 28429.dasm (0.64% of base)
           5 : 40609.dasm (0.64% of base)
           5 : 39736.dasm (1.18% of base)
           5 : 39949.dasm (0.95% of base)
           5 : 12028.dasm (0.64% of base)

Top file improvements (bytes):
        -133 : 5795.dasm (-33.67% of base)
        -111 : 22817.dasm (-9.71% of base)
        -111 : 28308.dasm (-8.60% of base)
         -81 : 26435.dasm (-5.67% of base)
         -81 : 39658.dasm (-5.67% of base)
         -75 : 42440.dasm (-5.95% of base)
         -62 : 32201.dasm (-4.64% of base)
         -62 : 18628.dasm (-4.21% of base)
         -58 : 44178.dasm (-4.43% of base)
         -48 : 27326.dasm (-9.06% of base)
         -41 : 19873.dasm (-5.03% of base)
         -31 : 39678.dasm (-1.81% of base)
         -29 : 27352.dasm (-14.87% of base)
         -28 : 26006.dasm (-11.16% of base)
         -26 : 39276.dasm (-4.03% of base)
         -24 : 42379.dasm (-2.80% of base)
         -24 : 26511.dasm (-1.39% of base)
         -24 : 43966.dasm (-2.51% of base)
         -21 : 38949.dasm (-9.95% of base)
         -21 : 26531.dasm (-3.98% of base)

191 total files with Code Size differences (142 improved, 49 regressed), 5 unchanged.

Top method regressions (bytes):
          29 ( 1.84% of base) : 26466.dasm - <InitializeReaderAsync>d__17:MoveNext():this
          29 ( 1.84% of base) : 38902.dasm - <InitializeReaderAsync>d__17:MoveNext():this
          17 ( 1.93% of base) : 26356.dasm - DbContext:get_InternalServiceProvider():IServiceProvider:this
          17 ( 1.93% of base) : 39667.dasm - DbContext:get_InternalServiceProvider():IServiceProvider:this
          15 ( 0.23% of base) : 42356.dasm - <ProxyAsync>d__6:MoveNext():this
          13 ( 3.98% of base) : 18848.dasm - Identity:Equals(Identity):bool:this
          13 ( 3.98% of base) : 32161.dasm - Identity:Equals(Identity):bool:this
          11 ( 4.53% of base) : 18802.dasm - RequestServicesFeature:get_RequestServices():IServiceProvider:this
          11 ( 4.53% of base) : 22235.dasm - RequestServicesFeature:get_RequestServices():IServiceProvider:this
          11 ( 4.58% of base) : 38884.dasm - RequestServicesFeature:get_RequestServices():IServiceProvider:this
          10 ( 4.93% of base) : 39968.dasm - AsyncEnumerator:DisposeAsync():ValueTask:this
          10 ( 0.40% of base) : 26451.dasm - <MoveNextAsync>d__16:MoveNext():this
          10 ( 4.93% of base) : 26980.dasm - AsyncEnumerator:DisposeAsync():ValueTask:this
           5 ( 0.64% of base) : 44477.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this
           5 ( 0.64% of base) : 19545.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this
           5 ( 0.64% of base) : 28429.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this
           5 ( 0.64% of base) : 40609.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this
           5 ( 1.18% of base) : 39736.dasm - RelationalLoggerExtensions:LogCommandExecuting(IDiagnosticsLogger`1,DbCommand,EventDefinition`5)
           5 ( 0.95% of base) : 39949.dasm - RelationalLoggerExtensions:LogCommandExecuted(IDiagnosticsLogger`1,DbCommand,TimeSpan,EventDefinition`6)
           5 ( 0.64% of base) : 12028.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this

Top method improvements (bytes):
        -133 (-33.67% of base) : 5795.dasm - HttpApplication`1:ExecuteAsync(ConnectionContext):Task:this
        -111 (-9.71% of base) : 22817.dasm - JsonMiddleware:Invoke(HttpContext):Task:this
        -111 (-8.60% of base) : 28308.dasm - JsonMiddleware:Invoke(HttpContext):Task:this
         -81 (-5.67% of base) : 26435.dasm - DbContext:Microsoft.EntityFrameworkCore.Internal.IDbContextPoolable.SetLease(DbContextLease):this
         -81 (-5.67% of base) : 39658.dasm - DbContext:Microsoft.EntityFrameworkCore.Internal.IDbContextPoolable.SetLease(DbContextLease):this
         -75 (-5.95% of base) : 42440.dasm - HttpProxy:SetupRequestBodyCopy(HttpRequest,bool,CancellationToken):StreamCopyHttpContent:this
         -62 (-4.64% of base) : 32201.dasm - MiddlewareHelpers:RenderFortunesHtml(IEnumerable`1,HttpContext,HtmlEncoder):Task
         -62 (-4.21% of base) : 18628.dasm - MiddlewareHelpers:RenderFortunesHtml(IEnumerable`1,HttpContext,HtmlEncoder):Task
         -58 (-4.43% of base) : 44178.dasm - HttpProxy:SetupRequestBodyCopy(HttpRequest,bool,CancellationToken):StreamCopyHttpContent:this
         -48 (-9.06% of base) : 27326.dasm - IDiagnosticsLogger:NeedsEventData(EventDefinitionBase,byref,byref):bool:this
         -41 (-5.03% of base) : 19873.dasm - PlainTextActionResult:ExecuteResultAsync(ActionContext):Task:this
         -31 (-1.81% of base) : 39678.dasm - RelationalCommand:CreateDbCommand(RelationalCommandParameterObject,Guid,int):DbCommand:this
         -29 (-14.87% of base) : 27352.dasm - RelationalCommandBuilder:Build():IRelationalCommand:this
         -28 (-11.16% of base) : 26006.dasm - InternalServiceCollectionMap:AddNewDescriptor(IList`1,ServiceDescriptor):this
         -26 (-4.03% of base) : 39276.dasm - PropertyHelper:MakeFastPropertySetter(PropertyInfo):Action`2
         -24 (-2.80% of base) : 42379.dasm - HttpProxy:CopyResponseStatusAndHeadersAsync(HttpResponseMessage,HttpContext,HttpTransformer):ValueTask`1
         -24 (-1.39% of base) : 26511.dasm - RelationalCommand:CreateDbCommand(RelationalCommandParameterObject,Guid,int):DbCommand:this
         -24 (-2.51% of base) : 43966.dasm - HttpProxy:CopyResponseStatusAndHeadersAsync(HttpResponseMessage,HttpContext,HttpTransformer):ValueTask`1
         -21 (-9.95% of base) : 38949.dasm - TypeExtensions:GetParametersCached(MethodBase):ref
         -21 (-3.98% of base) : 26531.dasm - RelationalConnection:ClearTransactions(bool):this

Top method regressions (percentages):
          10 ( 4.93% of base) : 39968.dasm - AsyncEnumerator:DisposeAsync():ValueTask:this
          10 ( 4.93% of base) : 26980.dasm - AsyncEnumerator:DisposeAsync():ValueTask:this
          11 ( 4.58% of base) : 38884.dasm - RequestServicesFeature:get_RequestServices():IServiceProvider:this
          11 ( 4.53% of base) : 18802.dasm - RequestServicesFeature:get_RequestServices():IServiceProvider:this
          11 ( 4.53% of base) : 22235.dasm - RequestServicesFeature:get_RequestServices():IServiceProvider:this
          13 ( 3.98% of base) : 18848.dasm - Identity:Equals(Identity):bool:this
          13 ( 3.98% of base) : 32161.dasm - Identity:Equals(Identity):bool:this
          17 ( 1.93% of base) : 26356.dasm - DbContext:get_InternalServiceProvider():IServiceProvider:this
          17 ( 1.93% of base) : 39667.dasm - DbContext:get_InternalServiceProvider():IServiceProvider:this
          29 ( 1.84% of base) : 26466.dasm - <InitializeReaderAsync>d__17:MoveNext():this
          29 ( 1.84% of base) : 38902.dasm - <InitializeReaderAsync>d__17:MoveNext():this
           5 ( 1.18% of base) : 39736.dasm - RelationalLoggerExtensions:LogCommandExecuting(IDiagnosticsLogger`1,DbCommand,EventDefinition`5)
           5 ( 0.95% of base) : 39949.dasm - RelationalLoggerExtensions:LogCommandExecuted(IDiagnosticsLogger`1,DbCommand,TimeSpan,EventDefinition`6)
           2 ( 0.87% of base) : 40261.dasm - StateManager:Clear():this
           2 ( 0.87% of base) : 27216.dasm - StateManager:Clear():this
           1 ( 0.83% of base) : 38893.dasm - <>c:<CreateConstructorCallSite>b__16_0(ConstructorInfo,ConstructorInfo):int:this
           5 ( 0.65% of base) : 22475.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this
           5 ( 0.64% of base) : 44477.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this
           5 ( 0.64% of base) : 19545.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this
           5 ( 0.64% of base) : 28429.dasm - HttpConnectionMiddleware`1:OnConnectionAsync(ConnectionContext):Task:this

Top method improvements (percentages):
        -133 (-33.67% of base) : 5795.dasm - HttpApplication`1:ExecuteAsync(ConnectionContext):Task:this
         -29 (-14.87% of base) : 27352.dasm - RelationalCommandBuilder:Build():IRelationalCommand:this
         -28 (-11.16% of base) : 26006.dasm - InternalServiceCollectionMap:AddNewDescriptor(IList`1,ServiceDescriptor):this
         -21 (-9.95% of base) : 38949.dasm - TypeExtensions:GetParametersCached(MethodBase):ref
        -111 (-9.71% of base) : 22817.dasm - JsonMiddleware:Invoke(HttpContext):Task:this
         -48 (-9.06% of base) : 27326.dasm - IDiagnosticsLogger:NeedsEventData(EventDefinitionBase,byref,byref):bool:this
        -111 (-8.60% of base) : 28308.dasm - JsonMiddleware:Invoke(HttpContext):Task:this
         -15 (-8.47% of base) : 39889.dasm - HtmlContentBuilderExtensions:SetHtmlContent(IHtmlContentBuilder,IHtmlContent):IHtmlContentBuilder
         -75 (-5.95% of base) : 42440.dasm - HttpProxy:SetupRequestBodyCopy(HttpRequest,bool,CancellationToken):StreamCopyHttpContent:this
         -81 (-5.67% of base) : 26435.dasm - DbContext:Microsoft.EntityFrameworkCore.Internal.IDbContextPoolable.SetLease(DbContextLease):this
         -81 (-5.67% of base) : 39658.dasm - DbContext:Microsoft.EntityFrameworkCore.Internal.IDbContextPoolable.SetLease(DbContextLease):this
         -41 (-5.03% of base) : 19873.dasm - PlainTextActionResult:ExecuteResultAsync(ActionContext):Task:this
         -19 (-4.80% of base) : 37956.dasm - Attribute:GetParentDefinition(PropertyInfo,ref):PropertyInfo
         -19 (-4.70% of base) : 39654.dasm - Controller:OnActionExecutionAsync(ActionExecutingContext,ActionExecutionDelegate):Task:this
         -19 (-4.65% of base) : 22264.dasm - Controller:OnActionExecutionAsync(ActionExecutingContext,ActionExecutionDelegate):Task:this
         -62 (-4.64% of base) : 32201.dasm - MiddlewareHelpers:RenderFortunesHtml(IEnumerable`1,HttpContext,HtmlEncoder):Task
         -58 (-4.43% of base) : 44178.dasm - HttpProxy:SetupRequestBodyCopy(HttpRequest,bool,CancellationToken):StreamCopyHttpContent:this
         -62 (-4.21% of base) : 18628.dasm - MiddlewareHelpers:RenderFortunesHtml(IEnumerable`1,HttpContext,HtmlEncoder):Task
         -26 (-4.03% of base) : 39276.dasm - PropertyHelper:MakeFastPropertySetter(PropertyInfo):Action`2
         -21 (-3.98% of base) : 26531.dasm - RelationalConnection:ClearTransactions(bool):this

@davidfowl
Copy link
Member

SPMI diff summary -- diffs in asp.net only. Despite this being a code-expanding transformation (since we're duplicating S1) it usually reduces method size.

Love seeing ASP.NET in here 👍🏾

@benaadams
Copy link
Member

-133 (-33.67% of base) : 5795.dasm - HttpApplication`1:ExecuteAsync(ConnectionContext):Task:this

Very much appreciating that are including aspnet dlls in the Jit comparisions now 💜

@AndyAyersMS AndyAyersMS mentioned this pull request Apr 26, 2021
54 tasks
Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice win!

It's too bad your detailed PR comment, including pictures, can't be part of the code comments. That would probably be helpful later on. Maybe there would be a way to add some textual formulation in the comments?

// Various policies for GuardedDevirtualization
CONFIG_INTEGER(JitGuardedDevirtualizationChainLikelihood, W("JitGuardedDevirtualizationChainLikelihood"), 0x4B) // 75
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears you actually want this in Release builds? For now for forever?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. I'd like to be able to experiment with it for a while in actual product builds.

@@ -5416,7 +5416,8 @@ class Compiler

void fgRemoveEmptyBlocks();

void fgRemoveStmt(BasicBlock* block, Statement* stmt);
void fgRemoveStmt(BasicBlock* block, Statement* stmt, bool isUnlink = false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isUnlink should use DEBUGARG

@@ -513,7 +513,8 @@ class IndirectCallTransformer
{
origCall = GetCall(stmt);

JITDUMP("*** %s contemplating [%06u]\n", Name(), compiler->dspTreeID(origCall));
JITDUMP("\n----------------\n\n*** %s contemplating [%06u] in " FMT_BB " \n", Name(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like indirectcalltransformer.cpp is a poor name for this file now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it is still "just" transforming indirect calls...

// fgUnlinkStmt: unlink a statement from a block's statement list
//
// Arguments:
// block - the block into which 'stmt' will be inserted;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments here, and in fgRemoveStmt, about "be inserted" are wrong; some copy/paste error

@benaadams
Copy link
Member

benaadams commented Apr 27, 2021

Would it work for this (though not suggesting its a common case)

unsafe class Program
{
    int SumFunc(int[] a, int k)
    {
        static bool where(int x, int k) => x > k;
        
        int sum = 0;
        for (var i = 1; i <= 1000; i++)
        {
            sum += Filter(i, k, &where);
        }
        return sum;
    }
    
    unsafe int Filter(int i, int k, delegate*<int, int, bool> f)
    {
        if (f(i, k)) { return 1; }
        return 0;
    }
}

As it looks to resolve where but then not inline it sharplab.io

Program.SumFunc(Int32[], Int32)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: push edi
    L0004: push esi
    L0005: push ebx
    L0006: mov esi, [ebp+8]
    L0009: xor edi, edi
    L000b: mov ebx, 1
    L0010: mov ecx, ebx
    L0012: mov edx, esi
    L0014: mov eax, Program.<SumFunc>g__where|0_0(Int32, Int32) ; where resolved
    L0019: call eax
    L001b: test eax, eax
    L001d: je short L0026
    L001f: mov eax, 1
    L0024: jmp short L0028
    L0026: xor eax, eax
    L0028: add edi, eax
    L002a: inc ebx
    L002b: cmp ebx, 0x3e8
    L0031: jle short L0010
    L0033: mov eax, edi
    L0035: pop ebx
    L0036: pop esi
    L0037: pop edi
    L0038: pop ebp
    L0039: ret 4

@AndyAyersMS
Copy link
Member Author

Would it work for this....

No, not yet.

Currently indirect call profiling is done at the class level, so indirect calls that aren't virtual or interface calls don't get profiled or optimized. It would not be hard to profile the remaining indirect call targets, but there are some complications because there are multiple "entry points" for methods. So when we profile such a calls site, we'd either have to pay the price of translating the various entry points to the MethodDesc (not sure how doable/costly this is) or defer and do it later on when we got to externalize or analyze the collected profile data. Assuming we ,managed to translate to a MethodDesc, when went to use the data to make a guarded test we'd have to pick one entry to test for (likely the multi-callable entry). Collectible assemblies also complicate things.

Even when the target is obvious the jit is not yet able to optimize indirect calls (see #44610). So it seems like fixing that is a prerequisite.

@AndyAyersMS
Copy link
Member Author

Jit-experimental does not run cleanly, but there are only three tests that fail (across several legs) so it should be simple enough to check that there are no unexpected failures.

// Various policies for GuardedDevirtualization
CONFIG_INTEGER(JitGuardedDevirtualizationChainLikelihood, W("JitGuardedDevirtualizationChainLikelihood"), 0x4B) // 75
CONFIG_INTEGER(JitGuardedDevirtualizationChainStatements, W("JitGuardedDevirtualizationChainStatements"), 2)
#if defined(DEBUG)
Copy link
Member

@EgorBo EgorBo Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean "how many unrelated statements are allowed between two GDV candidates"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes -- it's a (crude) limit on how much other code we're willing to duplicate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm asking because I had:

obj.GDV();
Console.WriteLine();
obj.GDV();

worked

obj.GDV();
Console.WriteLine();
Console.WriteLine();
obj.GDV();

did not 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set the statement limit to 2 based on some scouting in SPMI. But given the overall size decrease it might be ok to pick a larger value.

Rough numbers of opportunities / cost vs stmt limit:

limit chained segments dup count
0 29 0
1 218 189
2 262 251
3 307 366
4 313 390
6 320 432
8 321 440

@EgorBo
Copy link
Member

EgorBo commented Apr 27, 2021

Nice! I've cloned your branch and played with it locally, looks cool!

@AndyAyersMS
Copy link
Member Author

The failures in the jit-experimental legs are from known issues (#51898, #43534).

@AndyAyersMS
Copy link
Member Author

Upping the statement limit to 4 reduces the size improvement a bit, but it's still an improvement overall. So we'll go with that initially.

Total bytes of base: 182596
Total bytes of diff: 181411
Total bytes of delta: -1185 (-0.65% of base)
    diff is an improvement.

192 total methods with Code Size differences (138 improved, 54 regressed), 5 unchanged.

@AndyAyersMS
Copy link
Member Author

It's too bad your detailed PR comment, including pictures, can't be part of the code comments.

Not sure what the best solution is here. A big glob of ascii art? https://dot-to-ascii.ggerganov.com/ looked promising, but doesn't seem to like the embedded HTML layouts for the more complex modes (dot "record" nodes don't work out here, unfortunately).

For example:

                      +-----------------+                       +-----------------+
                      |  direct call 1  | <--      S0       --> | indirect call 1 |
                      +-----------------+                       +-----------------+
                        |                                         |
                        |                      type test 1        |
                        v                                         |
+---------------+                                                 |
| direct call 2 | <--         S1          <-----------------------+
+---------------+
  |
  |                       type test 2
  v
+---------------+
|      S2       |
+---------------+
  ^
  |
  |
  |                   +-----------------+
  +------------------ | indirect call 2 |
                      +-----------------+

@AndyAyersMS
Copy link
Member Author

If I use multi line labels instead of going for the record format it works out somewhat better...

+-----------------+     +---------------+
| indirect call 1 |     |      S0       |
|                 | <-- |  type test 1  |
+-----------------+     +---------------+
  |                       |
  |                       |
  |                       v
  |                     +---------------+
  |                     | direct call 1 |
  |                     +---------------+
  |                       |
  |                       |
  |                       v
  |                     +---------------+     +-----------------+
  |                     |      S1       |     | indirect call 2 |
  +-------------------> |  type test 2  | --> |                 |
                        +---------------+     +-----------------+
                          |                     |
                          |                     |
                          v                     |
                        +---------------+       |
                        | direct call 2 |       |
                        +---------------+       |
                          |                     |
                          |                     |
                          v                     |
                        +---------------+       |
                        |      S2       | <-----+
                        +---------------+
+-----------------+     +---------------+
| indirect call 1 |     |      S0       |
|       S1        | <-- |  type test 1  |
+-----------------+     +---------------+
  |                       |
  |                       |
  v                       v
+-----------------+     +---------------+
| indirect call 2 |     | direct call 1 |
|                 | <+  |      S1       |
+-----------------+  |  +---------------+
  |                  |    |
  |                  |    |
  |                  |    v
  |                  |  +---------------+
  |                  +- |  type test 2  |
  |                     +---------------+
  |                       |
  |                       |
  |                       v
  |                     +---------------+
  |                     | direct call 2 |
  |                     +---------------+
  |                       |
  |                       |
  |                       v
  |                     +---------------+
  +-------------------> |      S2       |
                        +---------------+

@BruceForstall
Copy link
Member

The dot-to-ascii idea is cool, but neither display you show are particularly enlightening, compared to your original. I wonder if we should accept markdown + PNG "sidecar" documentation in the repo, with "links" from the source code (maybe in a doc/ subdir)?

@AndyAyersMS
Copy link
Member Author

There's also a design note on guarded devirtualization, so perhaps I should update that and refer to it from the code.

@AndyAyersMS AndyAyersMS merged commit a4d5889 into dotnet:main Apr 28, 2021
@AndyAyersMS AndyAyersMS deleted the MoreAggressiveTailDupWithPGO branch April 28, 2021 00:45
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jun 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants