-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more iterators to JIT #52515
Add more iterators to JIT #52515
Conversation
d0de2f3
to
e634e12
Compare
1f5bcfd
to
87c7f74
Compare
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
Here are PIN throughput measurements over a superpmi replay of the benchmarks collection:
There is a "normal" Release build and "no-PGO" Release build, which is ~11% slower, but the "diff" case doesn't suffer from out-of-date PGO training data. There is about a %0.02 TP regression in Release. There is also a 2% regression in Debug, which is more expected, as function calls are done for the iteration |
@dotnet/jit-contrib Comments? Ideas? Suggestions? Should this be pursued? Do you like how the code looks? Does it seem like a useful abstraction improvement? |
As for the regressions I wonder if it should be exactly the same on clang/gcc |
For the release regression measurement, were you using non-PGO optimized builds? |
@@ -2194,7 +2185,7 @@ void Compiler::fgDumpTrees(BasicBlock* firstBlock, BasicBlock* lastBlock) | |||
|
|||
// Note that typically we have already called fgDispBasicBlocks() | |||
// so we don't need to print the preds and succs again here. | |||
for (BasicBlock* block = firstBlock; block; block = block->bbNext) | |||
for (BasicBlock* block = firstBlock; block != nullptr; block = block->bbNext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be for (BasicBlock* const block : BasicBlockSimpleList(firstBlock))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would work. I had converted the whole thing to:
for (BasicBlock* const block : BasicBlockRangeList(firstBlock, lastBlock))
{
fgDumpBlock(block);
}
but some callers pass in a nullptr for lastBlock
(and maybe firstBlock
?) and I didn't track down if that's ok.
Good point. This was for default Release builds. I need to figure out the current method for disabling PGO (any suggestions?). |
|
I updated the throughput table above with a Release-no-PGO run, and it shows only a 0.02% TP regression. The noise level appears to be about 0.05%, so this is in the noise. |
Great! I would have been somewhat surprised to find this sort of thing still causes problems for C++ compilers when optimizing. |
This is a nice improvement to the code. I assume this is no-diff? @chuck-mitchell would also approve. |
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
Correct. I ran x86 and x64 SPMI asm diffs. |
@dotnet/jit-contrib This is ready for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks great.
I'm still concerned about unexpected behavior if a collection is modified. Not sure if it's possible or worth keeping extra state in the debug builds iterators to try and catch when this happens.
Am also tempted by the idea that the block iterators could be expressed by overloads, eg
Blocks()
Blocks(BasicBlock* fromBlock)
Blocks(BasicBlock* fromBlock, BasicBlock* toBlock)
What kind of checking do you think might be useful? One idea I had is caching the "next" pointer, then verifying it when we go to
I like that; I'll make the change. |
That would be a good start -- probably enough to catch the most obvious failure pattern. |
It turns out that the EfficientEdgeCountInstrumentor, when inserting counts for critical edges, can create new basic blocks. So, asserting that the bbNext block is what we expected it to be when we first iterated to a block fails: the bbNext has changed. One option is to back off and use to old-style "manual" iteration for anything that changes to block list (as an example; some kind of checking might make sense for the other iterators). But I wonder if we should just be ok with changes to the flow graph. If the iterator pre-calculated There was another case for a forward walk was used in fgUnlinkRange, and that didn't work with my assert that Comments? |
Here's an updated PIN over superpmi replay of benchmarks collection for the current code:
So the Release build is 0.01% slower, which appears to be in the noise. The Debug build is 3% slower, which is expected. |
1. Make more BasicBlock members `const`.
`for (Statement* stmt : blk->NonPhiStatements())`
This matches the style used for block and GenTree iterators, and prevents errors where changing the stmt pointer expects the iterator location to change.
With this change, all block iteration can start from Compiler::Block(): ``` for (BasicBlock* const block : compiler->Blocks()) ... for (BasicBlock* const block : compiler->Blocks(startBlock)) ... for (BasicBlock* const block : compiler->Blocks(startBlock, endBlock)) ... ``` The second two could be `static`, but I haven't done that.
E.g.: ``` for (BasicBlock* const target : block->SwitchTargets()) ``` Also, (1) reverted one LSRA block iterator, (2) tweaked LSRA block ordering dump.
e.g., for (BasicBlock* const succ : block->Succs()) This is equivalent to looping using NumSucc() and GetSucc() using the versions which don't take a `Compiler*`.
If fall-through and branch destinations are identical, only iterate over one.
For simplicity and consistency.
ee5a57f
to
828bf32
Compare
/azp run runtime-coreclr jitstress, runtime-coreclr outerloop |
Azure Pipelines successfully started running 2 pipeline(s). |
@dotnet/jit-contrib This is ready for a final review. I've updated the top comment to describe all the added iterators. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an easy change to review -- I looked over several of the files I'm more familiar with and it looks good.
Add more iterators compatible with range-based
for
syntax for various data structures. These iterators all assume (and some check) that the underlying data structures determining the iteration are not changed during the iteration. For example, don't use these to iterator over the block list if you are also modifying the block list. Similarly, don't use these to iterate over the predecessor edges if you are changing the order or contents of the predecessor edge list.foreach_block
macro. E.g.:NumSucc()/GetSucc()
, orNumSucc(Compiler*)/GetSucc(Compiler*)
pairs, e.g.:Note that there already exists the "AllSuccessorsIter" which iterates over block successors including possible EH successors, e.g.:
BBJ_SWITCH
blocks), e.g.:Note that there already exists an iterator over all statements, e.g.:
gtNext
links, e.g.:This is a no-diff change.