Add more iterators to JIT #52515

BruceForstall · 2021-05-09T06:37:53Z

Add more iterators compatible with range-based for syntax for various data structures. These iterators all assume (and some check) that the underlying data structures determining the iteration are not changed during the iteration. For example, don't use these to iterator over the block list if you are also modifying the block list. Similarly, don't use these to iterate over the predecessor edges if you are changing the order or contents of the predecessor edge list.

BasicBlock: iterate over all blocks in the function, a subset starting not at the first block, or a specified range of blocks. Removed uses of the foreach_block macro. E.g.:

for (BasicBlock* const block : Blocks()) // all blocks in function
for (BasicBlock* const block : BasicBlockSimpleList(fgFirstBB->bbNext)) // all blocks starting at fgFirstBB->bbNext
for (BasicBlock* const testBlock : BasicBlockRangeList(firstNonLoopBlock, lastNonLoopBlock)) // all blocks in range (inclusive)

block predecessors: iterate over all predecessor edges, or all predecessor blocks, e.g.:

for (flowList* const edge : block->PredEdges())
for (BasicBlock* const predBlock : block->PredBlocks())

block successors: iterate over all block successors using the NumSucc()/GetSucc(), or NumSucc(Compiler*)/GetSucc(Compiler*) pairs, e.g.:

for (BasicBlock* const succ : Succs())
for (BasicBlock* const succ : Succs(compiler))

Note that there already exists the "AllSuccessorsIter" which iterates over block successors including possible EH successors, e.g.:

for (BasicBlock* succ : block->GetAllSuccs(m_pCompiler))

switch targets, (namely, the successors of BBJ_SWITCH blocks), e.g.:

for (BasicBlock* const bTarget : block->SwitchTargets())

loops blocks: iterate over all the blocks in a loop, e.g.:

for (BasicBlock* const blk : optLoopTable[loopInd].LoopBlocks())

Statements: added an iterator shortcut for the non-phi statements, e.g.:

for (Statement* const stmt : block->NonPhiStatements())

Note that there already exists an iterator over all statements, e.g.:

for (Statement* const stmt : block->Statements())

EH clauses, e.g.:

for (EHblkDsc* const HBtab : EHClauses(this))

GenTree in linear order (but not LIR, which already has an iterator), namely, using the gtNext links, e.g.:

for (GenTree* const call : stmt->TreeList())

This is a no-diff change.

BruceForstall · 2021-05-12T17:04:53Z

/azp run runtime-coreclr jitstress

azure-pipelines · 2021-05-12T17:05:10Z

Azure Pipelines successfully started running 1 pipeline(s).

BruceForstall · 2021-05-12T18:20:23Z

Here are PIN throughput measurements over a superpmi replay of the benchmarks collection:

Collection: benchmarks
	Debug	Checked	Release	Release (no PGO)	Debug/Checked	Debug/Release	Checked/Release	Rel-no-PGO/Rel
Baseline	531,491,152,090	168,090,820,492	32,748,694,066	36,500,757,131
	531,492,080,507	168,073,928,845	32,747,953,699	36,516,286,268
	531,466,358,038	168,085,854,512	32,734,810,088	36,514,926,941
Average:	531,483,196,878	168,083,534,616	32,743,819,284	36,510,656,780	3.162018208	16.23155785	5.133290444	1.11503965
(max - min) / min	4.83991E-05	0.000100501	0.000424135	0.000425447

Diff	544,010,189,065	168,089,266,473	33,131,445,166	36,515,701,230
	544,021,549,738	168,110,984,589	33,138,743,064	36,528,148,318
	544,023,088,633	168,114,904,944	33,132,977,072	36,509,253,833
Average:	544,018,275,812	168,105,052,002	33,134,388,434	36,517,701,127	3.236180408	16.41853982	5.07343156	1.102108802
(max - min) / min	2.3712E-05	0.000152529	0.000220271	0.000517526

Diff/Base (Average)	1.02358509	1.000128016	1.011928027	1.000192939

There is a "normal" Release build and "no-PGO" Release build, which is ~11% slower, but the "diff" case doesn't suffer from out-of-date PGO training data.

There is about a %0.02 TP regression in Release. There is also a 2% regression in Debug, which is more expected, as function calls are done for the iteration ++ and != operations.

BruceForstall · 2021-05-12T18:21:21Z

@dotnet/jit-contrib Comments? Ideas? Suggestions? Should this be pursued? Do you like how the code looks? Does it seem like a useful abstraction improvement?

src/coreclr/jit/compiler.cpp

EgorBo · 2021-05-12T18:40:01Z

As for the regressions I wonder if it should be exactly the same on clang/gcc
E.g. we recently discovered that MSVC doesn't unroll loops with iterators https://godbolt.org/z/enhcafn88

AndyAyersMS · 2021-05-12T18:47:15Z

For the release regression measurement, were you using non-PGO optimized builds?

EgorBo · 2021-05-12T18:48:40Z

src/coreclr/jit/fgdiagnostic.cpp

@@ -2194,7 +2185,7 @@ void Compiler::fgDumpTrees(BasicBlock* firstBlock, BasicBlock* lastBlock)

    // Note that typically we have already called fgDispBasicBlocks()
    // so we don't need to print the preds and succs again here.
-    for (BasicBlock* block = firstBlock; block; block = block->bbNext)
+    for (BasicBlock* block = firstBlock; block != nullptr; block = block->bbNext)


Shouldn't this be for (BasicBlock* const block : BasicBlockSimpleList(firstBlock)) ?

Yes, that would work. I had converted the whole thing to:

for (BasicBlock* const block : BasicBlockRangeList(firstBlock, lastBlock)) { fgDumpBlock(block); }

but some callers pass in a nullptr for lastBlock (and maybe firstBlock?) and I didn't track down if that's ok.

BruceForstall · 2021-05-12T20:00:14Z

For the release regression measurement, were you using non-PGO optimized builds?

Good point. This was for default Release builds. I need to figure out the current method for disabling PGO (any suggestions?).

sandreenko · 2021-05-12T20:02:50Z

Good point. This was for default Release builds. I need to figure out the current method for disabling PGO (any suggestions?).

build %SUBSET% -rc %MODE% -lc release -arch %ARCH% /p:NoPgoOptimize=true

BruceForstall · 2021-05-12T20:22:36Z

I updated the throughput table above with a Release-no-PGO run, and it shows only a 0.02% TP regression. The noise level appears to be about 0.05%, so this is in the noise.

AndyAyersMS · 2021-05-12T21:55:29Z

it shows only a 0.02% TP regression

Great! I would have been somewhat surprised to find this sort of thing still causes problems for C++ compilers when optimizing.

AndyAyersMS · 2021-05-12T22:02:30Z

This is a nice improvement to the code. I assume this is no-diff?

@chuck-mitchell would also approve.

BruceForstall · 2021-05-13T00:35:31Z

/azp run runtime-coreclr jitstress

azure-pipelines · 2021-05-13T00:35:49Z

Azure Pipelines successfully started running 1 pipeline(s).

BruceForstall · 2021-05-13T00:36:09Z

I assume this is no-diff?

Correct. I ran x86 and x64 SPMI asm diffs.

BruceForstall · 2021-05-13T00:36:46Z

@dotnet/jit-contrib This is ready for review

AndyAyersMS

Overall this looks great.

I'm still concerned about unexpected behavior if a collection is modified. Not sure if it's possible or worth keeping extra state in the debug builds iterators to try and catch when this happens.

Am also tempted by the idea that the block iterators could be expressed by overloads, eg

Blocks()
Blocks(BasicBlock* fromBlock)
Blocks(BasicBlock* fromBlock, BasicBlock* toBlock)

BruceForstall · 2021-05-13T17:09:50Z

I'm still concerned about unexpected behavior if a collection is modified. Not sure if it's possible or worth keeping extra state in the debug builds iterators to try and catch when this happens.

What kind of checking do you think might be useful? One idea I had is caching the "next" pointer, then verifying it when we go to ++. For something like BasicBlock, we could assert that this->bbNext->bbPrev == this, to check that the block hasn't been spliced out, for example.

Am also tempted by the idea that the block iterators could be expressed by overloads

I like that; I'll make the change.

AndyAyersMS · 2021-05-13T18:21:54Z

What kind of checking do you think might be useful? One idea I had is caching the "next" pointer...

That would be a good start -- probably enough to catch the most obvious failure pattern.

BruceForstall · 2021-05-13T21:50:40Z

It turns out that the EfficientEdgeCountInstrumentor, when inserting counts for critical edges, can create new basic blocks. So, asserting that the bbNext block is what we expected it to be when we first iterated to a block fails: the bbNext has changed.

One option is to back off and use to old-style "manual" iteration for anything that changes to block list (as an example; some kind of checking might make sense for the other iterators). But I wonder if we should just be ok with changes to the flow graph. If the iterator pre-calculated next, it would be a problem. Maybe it's ok to assume bbNext will only be consulted when ++ is used.

There was another case for a forward walk was used in fgUnlinkRange, and that didn't work with my assert that block->bbNext->bbPrev == block, since the bbPrev links outside of the range had been fixed up, and the range was the spliced-out blocks. Once again, it works fine without the asserts, but perhaps the asserts will actually catch something useful. Not sure what the best balance is.

Comments?

BruceForstall · 2021-06-06T01:15:41Z

Here's an updated PIN over superpmi replay of benchmarks collection for the current code:

Collection: benchmarks
	Debug	Checked	Release (no PGO)	Debug/Checked	Debug/Release	Checked/Release
Baseline	524,976,811,958	166,458,195,773	36,624,759,188
	525,029,406,347	166,472,846,364	36,617,857,951
	524,982,131,581	166,455,678,324	36,626,907,592
Average:	524,996,116,629	166,462,240,154	36,623,174,910	3.153845077	14.3350793	4.545270599
(max - min) / min	0.000100184	0.000103139	0.000247137

Diff	542,820,582,754	166,413,960,622	36,633,871,905
	542,871,277,889	166,380,588,108	36,630,646,502
	542,837,533,343	166,380,383,253	36,684,350,538
Average:	542,843,131,329	166,391,643,994	36,649,622,982	3.262442262	14.81169756	4.540064275
(max - min) / min	9.33921E-05	0.000201811	0.001466096

Diff/Base (Average)	1.033994565	0.999575903	1.000722168

So the Release build is 0.01% slower, which appears to be in the noise. The Debug build is 3% slower, which is expected.

1. Make more BasicBlock members `const`.

`for (Statement* stmt : blk->NonPhiStatements())`

This matches the style used for block and GenTree iterators, and prevents errors where changing the stmt pointer expects the iterator location to change.

With this change, all block iteration can start from Compiler::Block(): ``` for (BasicBlock* const block : compiler->Blocks()) ... for (BasicBlock* const block : compiler->Blocks(startBlock)) ... for (BasicBlock* const block : compiler->Blocks(startBlock, endBlock)) ... ``` The second two could be `static`, but I haven't done that.

E.g.: ``` for (BasicBlock* const target : block->SwitchTargets()) ``` Also, (1) reverted one LSRA block iterator, (2) tweaked LSRA block ordering dump.

e.g., for (BasicBlock* const succ : block->Succs()) This is equivalent to looping using NumSucc() and GetSucc() using the versions which don't take a `Compiler*`.

If fall-through and branch destinations are identical, only iterate over one.

For simplicity and consistency.

BruceForstall · 2021-06-07T03:56:23Z

/azp run runtime-coreclr jitstress, runtime-coreclr outerloop

azure-pipelines · 2021-06-07T03:56:48Z

Azure Pipelines successfully started running 2 pipeline(s).

BruceForstall · 2021-06-07T03:58:55Z

@dotnet/jit-contrib This is ready for a final review. I've updated the top comment to describe all the added iterators.

AndyAyersMS

Not an easy change to review -- I looked over several of the files I'm more familiar with and it looks good.

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 9, 2021

BruceForstall force-pushed the AddIterators branch from d0de2f3 to e634e12 Compare May 9, 2021 06:43

dotnet deleted a comment from azure-pipelines bot May 9, 2021

BruceForstall force-pushed the AddIterators branch from 1f5bcfd to 87c7f74 Compare May 12, 2021 17:04

dotnet deleted a comment from azure-pipelines bot May 12, 2021

BruceForstall changed the title ~~Add iterators~~ Add more iterators to JIT May 12, 2021

EgorBo reviewed May 12, 2021

View reviewed changes

src/coreclr/jit/compiler.cpp Show resolved Hide resolved

EgorBo reviewed May 12, 2021

View reviewed changes

BruceForstall marked this pull request as ready for review May 13, 2021 00:36

AndyAyersMS approved these changes May 13, 2021

View reviewed changes

EgorBo approved these changes May 13, 2021

View reviewed changes

BruceForstall added 20 commits June 6, 2021 20:43

Add more JIT data structure iterators

691fb23

1. Make more BasicBlock members `const`.

Convert all non-phi statement iteration to the form:

259d8b8

`for (Statement* stmt : blk->NonPhiStatements())`

BasicBlock iterator

29b3002

Add iterators for predecessor edges and predecessor blocks

0c6f04c

Add iterator for blocks in a loop

df142e5

Add more block iterators for less trivial cases

7c2d365

Fix 2 issues with Block iterators; Add EH clause iterator

794bc30

Add GenTree iterator

4fcbc2c

Mark Statement pointer as const in iterators

dabf463

This matches the style used for block and GenTree iterators, and prevents errors where changing the stmt pointer expects the iterator location to change.

Formatting

f03d4b3

Add comments for the iterator classes and functions

99d7d5b

Add iterator for switch targets

75711d1

E.g.: ``` for (BasicBlock* const target : block->SwitchTargets()) ``` Also, (1) reverted one LSRA block iterator, (2) tweaked LSRA block ordering dump.

Fix Linux build

57d9fa4

Add iterator for block successors

91108db

e.g., for (BasicBlock* const succ : block->Succs()) This is equivalent to looping using NumSucc() and GetSucc() using the versions which don't take a `Compiler*`.

Fix BBJ_COND block iteration

33f0706

If fall-through and branch destinations are identical, only iterate over one.

Add iterator for NumSucc/GetSucc taking Compiler*

a3f3d40

Move single-use iterator implementations to be nested classes

ea48595

For simplicity and consistency.

Fix merge issue

729a34c

Add iterators, check that collections aren't being modified

828bf32

BruceForstall force-pushed the AddIterators branch from ee5a57f to 828bf32 Compare June 7, 2021 03:45

runfoapp bot mentioned this pull request Jun 7, 2021

Compiler crashes when failing to release Mutex #53420

Open

AndyAyersMS approved these changes Jun 7, 2021

View reviewed changes

BruceForstall merged commit 5788ea0 into dotnet:main Jun 7, 2021

BruceForstall deleted the AddIterators branch June 7, 2021 23:16

ghost locked as resolved and limited conversation to collaborators Jul 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add more iterators to JIT #52515

Add more iterators to JIT #52515

BruceForstall commented May 9, 2021 •

edited

Loading

BruceForstall commented May 12, 2021

azure-pipelines bot commented May 12, 2021

BruceForstall commented May 12, 2021 •

edited

Loading

BruceForstall commented May 12, 2021

EgorBo commented May 12, 2021

AndyAyersMS commented May 12, 2021

EgorBo May 12, 2021

BruceForstall May 12, 2021

BruceForstall commented May 12, 2021

sandreenko commented May 12, 2021 •

edited

Loading

BruceForstall commented May 12, 2021

AndyAyersMS commented May 12, 2021

AndyAyersMS commented May 12, 2021

BruceForstall commented May 13, 2021

azure-pipelines bot commented May 13, 2021

BruceForstall commented May 13, 2021

BruceForstall commented May 13, 2021

AndyAyersMS left a comment

BruceForstall commented May 13, 2021

AndyAyersMS commented May 13, 2021

BruceForstall commented May 13, 2021

BruceForstall commented Jun 6, 2021

BruceForstall commented Jun 7, 2021

azure-pipelines bot commented Jun 7, 2021

BruceForstall commented Jun 7, 2021

AndyAyersMS left a comment

Add more iterators to JIT #52515

Add more iterators to JIT #52515

Conversation

BruceForstall commented May 9, 2021 • edited Loading

BruceForstall commented May 12, 2021

azure-pipelines bot commented May 12, 2021

BruceForstall commented May 12, 2021 • edited Loading

BruceForstall commented May 12, 2021

EgorBo commented May 12, 2021

AndyAyersMS commented May 12, 2021

EgorBo May 12, 2021

Choose a reason for hiding this comment

BruceForstall May 12, 2021

Choose a reason for hiding this comment

BruceForstall commented May 12, 2021

sandreenko commented May 12, 2021 • edited Loading

BruceForstall commented May 12, 2021

AndyAyersMS commented May 12, 2021

AndyAyersMS commented May 12, 2021

BruceForstall commented May 13, 2021

azure-pipelines bot commented May 13, 2021

BruceForstall commented May 13, 2021

BruceForstall commented May 13, 2021

AndyAyersMS left a comment

Choose a reason for hiding this comment

BruceForstall commented May 13, 2021

AndyAyersMS commented May 13, 2021

BruceForstall commented May 13, 2021

BruceForstall commented Jun 6, 2021

BruceForstall commented Jun 7, 2021

azure-pipelines bot commented Jun 7, 2021

BruceForstall commented Jun 7, 2021

AndyAyersMS left a comment

Choose a reason for hiding this comment

BruceForstall commented May 9, 2021 •

edited

Loading

BruceForstall commented May 12, 2021 •

edited

Loading

sandreenko commented May 12, 2021 •

edited

Loading