introduced a cache for `followAllReferences()` calls with default parameters #7192

firewave · 2025-01-07T20:24:06Z

No description provided.

firewave · 2025-01-07T20:28:21Z

This essentially eliminates any meaningful performance impact of `followAllReferences()`.

-D__GNUC__ --check-level=exhaustive ../lib/utils.cpp

Clang 19 652,987,030 -> ~~624,476,510~~ 618,089,977

followAllReferences() calls from isAliasOf() - 350,100 -> 1,581

The example from https://trac.cppcheck.net/ticket/10765#comment:4:

Clang 19 3,056,382,003 -> ~~2,838,708,731~~ 2,815,165,117

followAllReferences() calls from isAliasOf() - 2,592,565 -> 641

firewave · 2025-01-07T20:33:25Z

lib/valueflow.cpp

@@ -3738,7 +3738,7 @@ static void valueFlowForwardConst(Token* start,
        } else {
            [&] {
                // Follow references
-                auto refs = followAllReferences(tok);
+                auto refs = tok->refs();


This needs to be `const auto&`.

Done.

I filed https://trac.cppcheck.net/ticket/13533 about detecting this.
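For context on why the `const auto&` matters: plain `auto` drops references and cv-qualifiers during deduction, so `auto refs = tok->refs();` copies the container even when `refs()` returns a const reference. The following is a minimal sketch with made-up stand-in types (`CountingVec`, `Tok`, and the helper functions are hypothetical and not part of Cppcheck); it only counts copy constructions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Global copy counter used by the hypothetical container below.
inline std::size_t& copyCount() {
    static std::size_t n = 0;
    return n;
}

// Hypothetical stand-in for the cached container; counts copy constructions.
struct CountingVec {
    std::vector<int> data;
    explicit CountingVec(std::vector<int> d) : data(std::move(d)) {}
    CountingVec(const CountingVec& other) : data(other.data) { ++copyCount(); }
};

// Hypothetical stand-in for a token that hands out its cache by const reference.
struct Tok {
    CountingVec cached{std::vector<int>{1, 2, 3}};
    const CountingVec& refs() const { return cached; }
};

std::size_t copiesWithAuto(const Tok& tok) {
    const std::size_t before = copyCount();
    auto refs = tok.refs();        // deduces CountingVec: the cache is copied
    (void)refs;
    return copyCount() - before;
}

std::size_t copiesWithConstAutoRef(const Tok& tok) {
    const std::size_t before = copyCount();
    const auto& refs = tok.refs(); // binds to the cached object: no copy
    (void)refs;
    return copyCount() - before;
}
```

Under these assumptions `copiesWithAuto` reports one copy per call and `copiesWithConstAutoRef` reports none.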

firewave · 2025-01-07T20:34:32Z

lib/vf_analyzers.cpp

@@ -627,7 +627,7 @@ struct ValueFlowAnalyzer : Analyzer {
        if (invalid())
            return Action::Invalid;
        // Follow references
-        auto refs = followAllReferences(tok);
+        auto refs = tok->refs();


This copy is currently required because an additional entry is appended to the list. I think it can be avoided, though, and I will try to refactor the code accordingly.

I adjusted this, but unfortunately it introduces some redundant code.
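One possible shape for such a refactoring, sketched with stand-in types: iterate the cached list in place and handle the extra entry separately through a shared callable, so no copy is needed and the per-entry logic is written once. This is not the code from the PR; `cachedRefs`, `visitWithCopy`, and `visitWithoutCopy` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the cached reference list of a token.
const std::vector<std::string>& cachedRefs() {
    static const std::vector<std::string> refs{"a", "b"};
    return refs;
}

// Variant 1: copy the cached list so that one more entry can be appended.
std::vector<std::string> visitWithCopy(const std::string& extra) {
    std::vector<std::string> visited;
    auto refs = cachedRefs();   // full copy of the cached list
    refs.push_back(extra);
    for (const auto& r : refs)
        visited.push_back(r);
    return visited;
}

// Variant 2: iterate the cache in place and visit the extra entry separately.
std::vector<std::string> visitWithoutCopy(const std::string& extra) {
    std::vector<std::string> visited;
    const auto visit = [&visited](const std::string& r) { visited.push_back(r); };
    for (const auto& r : cachedRefs())
        visit(r);
    visit(extra);               // same logic, no container copy
    return visited;
}
```

Both variants visit the same entries in the same order; only the second leaves the cached list untouched.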

pfultz2 · 2025-01-07T20:40:42Z

lib/token.cpp

+    if (!mImpl->mRefs)
+        mImpl->mRefs = new SmallVector<ReferenceToken>(followAllReferences(this));
+    return *mImpl->mRefs;
+}


This is not the right way to do this. This is a const method that is modifying the token. Instead followAllReferences should be moved to the SymbolDatabase and there should be a pass that fills this in for all of the tokens.

Yes, I was about to add a comment about this. It violates technical constness, and if we did not allow that (I hope to finish that change some day), it would require `mutable`, which in my experience is acceptable for caches inside objects.

I am not sure how easy it would be to implement an earlier pass. References are not followed for all tokens, and lots of checks are performed before we actually end up following them. Those checks would need to be replicated, I reckon, which has a certain visible overhead, and we would then run through them twice.
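To illustrate the `mutable` variant mentioned here, a minimal sketch of a lazily filled cache inside a const accessor. `Node` and its members are hypothetical stand-ins, not the Cppcheck `Token` class, and the computed values are placeholders.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a token; not the Cppcheck Token class.
class Node {
public:
    explicit Node(int id) : mId(id) {}

    // Const for callers: the result never changes, only the first call computes it.
    const std::vector<int>& refs() const {
        if (!mCached) {
            mRefs = computeRefs();
            mCached = true;
        }
        return mRefs;
    }

    std::size_t computeCalls() const { return mComputeCalls; }

private:
    // Placeholder for the expensive computation; counts how often it runs.
    std::vector<int> computeRefs() const {
        ++mComputeCalls;
        return {mId, mId + 1};
    }

    int mId;
    mutable std::vector<int> mRefs;        // cache storage, writable from const methods
    mutable bool mCached = false;
    mutable std::size_t mComputeCalls = 0;
};
```

Repeated calls to `refs()` on the same object trigger the computation once. Note that a cache like this is not safe to fill concurrently from several threads without additional synchronization.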

Actually I would also have the ValueFlow behave this way so we might avoid running it for code which is not relevant.

> This is not the right way to do this. This is a const method that is modifying the token.

That should be totally fine (by precedent). We modify const Token objects all over the place in the ValueFlow and symbol database via const_cast. Obviously it would be better if we didn't but here it is much cleaner and in a single place and as stated before I think this is acceptable practice.

> Actually I would also have the ValueFlow behave this way so we might avoid running it for code which is not relevant.

Please disregard this. This is wishful thinking as this would not be possible the way the ValueFlow is working. I totally forgot I already looked into this.

The const_cast should be fixed, but we shouldn't add more code that needs to be fixed.

Also, this is called in ValueFlowForward and ValueFlowReverse, so it is already called on almost every token in functionScopes. Making it a cache really won't help performance.

Furthermore, in cppcheck we update the tokens through passes rather than using a cache. This makes it easier to debug, and we can provide this information to addons later on. So doing a pass in SymbolDatabase would be consistent with the rest of the code.

> Furthermore, in cppcheck we update the tokens through passes rather than using a cache. This makes it easier to debug, and we can provide this information to addons later on. So doing a pass in SymbolDatabase would be consistent with the rest of the code.

Will give it a try and check how it impacts performance.

It will be problematic because of the `--check-level` handling. A lot of these calls are not performed unless the level is exhaustive, so a pass would perform unnecessary calls and lose performance. If we put the pass behind a flag to avoid that, the calls would need to be performed on demand again. Then we would have duplicated calls again and would need the cache again.

So the current approach seems like the best approach.
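The trade-off argued here can be sketched as a call count: an eager pass pays for every token, while on-demand caching pays only for tokens that are actually queried. This is a simplified model with hypothetical names (`Tok`, `expensiveFollow`, `eagerPass`, `onDemand`); it says nothing about which tokens Cppcheck really queries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token with room for a precomputed or cached result.
struct Tok {
    int id = 0;
    bool computed = false;
    int refs = 0;
};

// Stand-in for the expensive analysis; counts its invocations.
int expensiveFollow(const Tok& tok, std::size_t& calls) {
    ++calls;
    return tok.id * 2;
}

// Eager pass: fill in the result for every token up front.
std::size_t eagerPass(std::vector<Tok>& toks) {
    std::size_t calls = 0;
    for (Tok& t : toks) {
        t.refs = expensiveFollow(t, calls);
        t.computed = true;
    }
    return calls;
}

// On demand: compute at the first query of a token, reuse afterwards.
std::size_t onDemand(std::vector<Tok>& toks, const std::vector<std::size_t>& queries) {
    std::size_t calls = 0;
    for (std::size_t idx : queries) {
        Tok& t = toks[idx];
        if (!t.computed) {
            t.refs = expensiveFollow(t, calls);
            t.computed = true;
        }
    }
    return calls;
}
```

With 100 tokens and queries that touch only two distinct tokens, the eager pass performs 100 computations and the on-demand variant performs 2. If nearly every token ends up being queried anyway, the two counts converge.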

Do we know if it causes a perf impact or how much? It seems we are making it worse for premature optimizations.

There are other advantages to doing it the correct way too, such as better debugging, and addons can take advantage of this information (this seems like a useful analysis for addons). So if we enable it for addons, we will need to run a pass regardless.

Also you could consider skipping this for functions we are skipping analysis for, if the performance is too bad, but it would be good to see some actual numbers to make this decision.

> So the current approach seems like the best approach.

I meant to say "It seems like the currently best approach".

> Do we know if it causes a perf impact or how much? It seems we are making it worse for premature optimizations.

Various performance numbers are in the PR. It is a massive improvement. It would also help with the runtime of the CI.

> Also you could consider skipping this for functions we are skipping analysis for, if the performance is too bad, but it would be good to see some actual numbers to make this decision.

That was an idea regarding the ValueFlow (see https://trac.cppcheck.net/ticket/12528) but that won't work since not all passes are based on function scopes. But that is currently out-of-scope and is something I am looking at within another context hopefully soon.

It might actually not be an issue after all, because with the duplicated calls eliminated it basically no longer has any footprint. The only issue could be that we perform it for more tokens than we actually need, which would introduce new overhead, but it might not be much. Will test that. Although I would prefer not to have that at all since all the overhead adds up - a lot.

I just realized this is called when setting exprids, so it is always called on every token regardless of ValueFlow analysis.

pfultz2 · 2025-01-07T20:43:40Z

lib/token.h

@@ -143,6 +145,8 @@ struct TokenImpl {
    void setCppcheckAttribute(CppcheckAttributes::Type type, MathLib::bigint value);
    bool getCppcheckAttribute(CppcheckAttributes::Type type, MathLib::bigint &value) const;

+    SmallVector<ReferenceToken>* mRefs{};


If this is going to be a pointer, you should use std::unique_ptr or std::shared_ptr.

I modeled it after mValues which is also just a raw pointer.
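For comparison, a small sketch of what the two ownership styles imply. With a raw owning pointer the enclosing type has to release the memory (and deal with copying) by hand; with `std::unique_ptr` that is handled by the member itself. `Tracked`, `ImplRaw`, and `ImplOwned` are hypothetical types and do not mirror `TokenImpl`.

```cpp
#include <memory>
#include <vector>

// Hypothetical element type that tracks how many instances are alive.
struct Tracked {
    static int& alive() { static int n = 0; return n; }
    Tracked() { ++alive(); }
    Tracked(const Tracked&) { ++alive(); }
    ~Tracked() { --alive(); }
};

// Raw owning pointer: needs a hand-written destructor and copy handling.
struct ImplRaw {
    std::vector<Tracked>* mRefs{};
    ImplRaw() = default;
    ImplRaw(const ImplRaw&) = delete;
    ImplRaw& operator=(const ImplRaw&) = delete;
    ~ImplRaw() { delete mRefs; }
};

// std::unique_ptr: ownership is part of the type, cleanup is automatic.
struct ImplOwned {
    std::unique_ptr<std::vector<Tracked>> mRefs;
};

// Creates both kinds of owners in a scope and reports what is still alive afterwards.
int aliveAfterScope() {
    {
        ImplRaw raw;
        raw.mRefs = new std::vector<Tracked>(2);
        ImplOwned owned;
        owned.mRefs.reset(new std::vector<Tracked>(3));
    }
    return Tracked::alive();
}
```

Both variants release everything here, but only `ImplRaw` relies on the manually written destructor to do so.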

firewave · 2025-01-07T20:59:27Z

Something really weird is going on here in the UBSAN job:

Check time: cli/threadexecutor.cpp: 0.53017s
Check time: cli/processexecutor.cpp: 1.41327s
Check time: lib/addoninfo.cpp: 0.172107s
Check time: lib/analyzerinfo.cpp: 0.636273s

The timing information for cli/cmdlineparser.cpp is missing ...

firewave · 2025-01-07T21:20:25Z

Before

Check time: cli/cmdlineparser.cpp: 1091.73s
[...]
Check time: lib/checkio.cpp: 219.069s
[...]
Check time: lib/symboldatabase.cpp: 191.785s
[...]
Check time: lib/tokenize.cpp: 290.026s

After

Check time: cli/cmdlineparser.cpp: 760.299s
[...]
Check time: lib/checkio.cpp: 168.103s
[...]
Check time: lib/symboldatabase.cpp: 145.913s
[...]
Check time: lib/tokenize.cpp: 236.561s
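To put the before/after check times into relative terms, a short calculation of the reduction per file. The numbers are copied from the log excerpts above; the `Timing` struct and function names are only for illustration.

```cpp
#include <string>
#include <vector>

struct Timing {
    std::string file;
    double beforeSec;
    double afterSec;
};

// Relative reduction in percent: (before - after) / before * 100.
double reductionPercent(const Timing& t) {
    return (t.beforeSec - t.afterSec) / t.beforeSec * 100.0;
}

// Check times copied from the "Before" and "After" logs.
std::vector<Timing> checkTimings() {
    return {
        {"cli/cmdlineparser.cpp", 1091.73, 760.299},
        {"lib/checkio.cpp", 219.069, 168.103},
        {"lib/symboldatabase.cpp", 191.785, 145.913},
        {"lib/tokenize.cpp", 290.026, 236.561},
    };
}
```

For these four files the reduction works out to roughly 30%, 23%, 24%, and 18% respectively.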


firewave force-pushed the followref-cache branch from 42d1ec9 to b38e8ba Compare January 8, 2025 13:04

firewave marked this pull request as ready for review January 8, 2025 13:38

firewave added 2 commits January 14, 2025 16:38

introduced a cache for followAllReferences() calls with default parameters

d3e35a0

vf_analyzers.cpp: avoid unnecessary copy in `ValueFlowAnaylzer::analyze()`

a36380e

firewave force-pushed the followref-cache branch from b38e8ba to a36380e Compare January 14, 2025 15:38

firewave marked this pull request as draft January 19, 2025 16:48
