Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve inference for paths leading to similar #37163

Merged
merged 10 commits into from
Sep 1, 2020
Merged

Conversation

timholy
Copy link
Member

@timholy timholy commented Aug 23, 2020

For many packages that extend array functionality, similar now serves as the single biggest source of invalidation. These invalidations have been hard to diagnose using my SnoopCompile machinery, because runtime dispatch and a lack of "narrowing" of types lead to an absence of backedges (#36892). Without backedges, it's not easy to discover the callers, which is where you'd have to make changes to improve the quality of inference. This motivated me to implement a new general-purpose tool, findcallers in the MethodAnalysis package, that lets you search the type-inferred code of every MethodInstance for calls of a particular function with particular signature. (If you're already a MethodAnalysis user, note that 0.4.0 is the first release where findcallers is reasonably reliable, and even then it has some significant limitations as described in the docs).

I was able to use this tool to discover that the issue is not with the implementation of similar per se, but that we had many cases of comprehensions, map, broadcasting, and concatenation with poor inferrability. This PR, together with a companion to Pkg that I'll submit shortly, fixes almost all of them. Some of the patterns were a bit surprising, for example drilling down into

using Cthulhu
@descend vcat([1,2,3], 4)

you will discover that you get down into _cat_t(::Val{1},::DataType,::Vector{Int64},::Vararg{Any,N} where N)::Any and you end up calling similar with poor type information. (Conversely, push!(copy([1,2,3]), 4) has no such problems.)

An earlier version of this PR fixed all cases with poor container-type inference---we still had some with poor element-type inference from cases where collect(::Generator{I,F}) had stopped specializing on the function type F, but since similar tends to be specialized on I this isn't really that big of a deal. However, I've at least temporarily abandoned the attempt to fix every last instance because of a single function, maptwice, heavily used by pmap. If we could use mapany there then our inference problems there too would go away, but we have tests that explicitly check for element-type narrowing so this seems to be a no-go. It's a little odd given that Distributed has rampant inference problems to insist on the element narrowing, but perhaps that's essentially a way of compensating for Distributed's challenges in this regard. (And gettting good inference does seem in principle like it could be harder in a distributed context.)

Even having reverted those two mapanys in maptwice, this PR cuts the number of remaining invalidations from loading StaticArrays in half, compared to a local branch that includes all my still-open invalidation PRs. That leaves us with just 24 invalidations from StaticArrays, which is rather nice given that on Julia 1.5 it's more than 2000. Even for CategoricalArrays, similar now (locally, i.e., with all my modifications) only contributes 30 unique invalidations out of more than 2000 total, resolving a concern that @nalimilan expressed in JuliaData/CategoricalArrays.jl#177.

Noteworthy details (particularly directed at reviewers of this PR)

First, this changes field types of a few structs, including Cmd which is exported. I don't think Cmd needs to support multidmensional arrays of Strings, but perhaps I'm mistaken.

Reviewers, also note the extra pickiness about arg types in REPL keymap functions. Since setup_interface converts all user-supplied keymaps into Dict{Any,Any} and none of the other functions are part of REPL's official interface, I don't think this would be considered breaking, but I did have to change some of REPL's tests (some of which directly call internal functions) in order to make them pass. As a reminder, the entire LineEdit module is under @nospecialize, so unless method arguments provide type information then inference uses Any no matter what the argument type actually is.

A final item of note concerns Julia's own build process. Together with my other open PRs, the total number of precompile statements is down to about 1550, a drop of about 400 compared to a couple months ago. This is getting close to the point where we'll actually have to change this build-time assertion. The system image is also the smallest I've seen it in a long time:

tim@diva:~/src/julia-master$ ls -l usr/lib/julia/sys.so 
-rwxr-xr-x 1 tim holy 141802056 Aug 23 08:05 usr/lib/julia/sys.so

(Reference #37088 (comment) for where I was relatively recently.) While my general experience has been that the precompilation phase has gone slightly faster with fewer MethodInstances, this PR (or perhaps more likely, the companion Pkg PR I'll also submit) violates that: now I'm getting

Generating precompile statements... 30/30
Executing precompile statements... 1543/1554
Precompilation complete. Summary:
Total ─────── 141.154365 seconds
Generation ── 118.166039 seconds 83.7141%
Execution ───  22.988326 seconds 16.2859%
    LINK usr/lib/julia/sys.so

which is a big slowdown. I'm guessing it's because inference stops "punting" so much, but I haven't made any attempt to investigate this.

@timholy timholy added the compiler:latency Compiler latency label Aug 23, 2020
timholy added a commit to JuliaDebug/Cthulhu.jl that referenced this pull request Aug 23, 2020
timholy added a commit to JuliaDebug/JuliaInterpreter.jl that referenced this pull request Aug 23, 2020
These became visible when working on JuliaLang/julia#37163.
Most are not in a truly performance-critical path (though of course
FrameCode optimization is not entirely uncritical, either), but one of them
is.
timholy added a commit to JuliaDebug/JuliaInterpreter.jl that referenced this pull request Aug 23, 2020
These became visible when working on JuliaLang/julia#37163.
Most are not in a truly performance-critical path (though of course
FrameCode optimization is not entirely uncritical, either), but one of them
is.
@timholy timholy force-pushed the teh/similar_inference branch from ad72eae to 1aa16e2 Compare August 24, 2020 12:38
@timholy
Copy link
Member Author

timholy commented Aug 24, 2020

While it's basically incidental to the thrust of this PR, I added a @nospecialize to (EDIT: dropped) _str_sizehint, which had 56 MethodInstance specializations. But perhaps that was a bad idea since sizehint is typically a performance optimization. It's just that the way _str_sizehint is written makes it seem like it wants to make decisions at runtime.

I should also ask about Base.print_to_string---it has 222 specializations, which seems like a lot.

base/abstractarray.jl Outdated Show resolved Hide resolved
base/Enums.jl Outdated Show resolved Hide resolved
@KristofferC
Copy link
Member

I tried master vs this for building julia sysimage and with this PR it was 4 seconds faster so I don't see the

which is a big slowdown.

effect, at least.

@@ -781,7 +781,7 @@ function show_backtrace(io::IO, t::Vector)

try invokelatest(update_stackframes_callback[], filtered) catch end
# process_backtrace returns a Vector{Tuple{Frame, Int}}
frames = (first.(filtered))::Vector{StackFrame}
frames = map(x->first(x)::StackFrame, filtered)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
frames = map(x->first(x)::StackFrame, filtered)
frames = StackFrame[first(x) for x in filtered]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually these two are different. My version asserts that first(x) is already a StackFrame; your version will call convert(StackFrame, x::Any)::StackFrame. Given that we know x is already a StackFrame, that makes my version less vulnerable to convert invalidations.

@pabloferz
Copy link
Contributor

pabloferz commented Aug 28, 2020

The only thing that makes me uneasy here is the _str_sizehint. I feel it could have performance impacts, so I'd be inclined to leave that out.

timholy and others added 10 commits September 1, 2020 03:34
Because the tuple-length is unknown and because inference gives up
easily in the face of missing type parameters, the generator expressions
in the previous implementation were poorly inferred.
This also makes mapany safe for iterators without `length`
The main concern here is the specification of types in the REPL code
required changes to the tests. However, since these are testing
internal functions that does not seem to be too serious a concern.
Co-authored-by: Pablo Zubieta <8410335+pabloferz@users.noreply.github.com>
Co-authored-by: Pablo Zubieta <8410335+pabloferz@users.noreply.github.com>
@timholy timholy force-pushed the teh/similar_inference branch from f030463 to 9686cf9 Compare September 1, 2020 08:36
@timholy
Copy link
Member Author

timholy commented Sep 1, 2020

I dropped the _str_sizehint commit.

@timholy timholy merged commit 294e5ae into master Sep 1, 2020
@timholy timholy deleted the teh/similar_inference branch September 1, 2020 12:12
aviatesk added a commit to aviatesk/julia that referenced this pull request Sep 6, 2020
- `LineEdit.keymap` was annotated with concrete argument types to
prevent invalidations in JuliaLang#37163
- but it loses generics of this function, which in turn makes
OhMyREPL.jl errors on its initialization with `NoMethodError` on master
  * especially [this
line](/~https://github.com/KristofferC/OhMyREPL.jl/blob/cc7e467606e1cc13f11f95576d1a12c371cd4214/src/repl.jl#L284) can't be matched to `Union{Vector{AnyDict}, Vector{Dict{Char,Any}}}`
  * no sure why but `LineEdit.keymap(Base.AnyDict[NEW_KEYBINDINGS, main_mode.keymap_dict])` doesn't help the matching; the error seems to say `Base.AnyDict[NEW_KEYBINDINGS, main_mode.keymap_dict]` is somehow
widened to `Vector{Dict}`:
  ```
  MethodError: no method matching keymap(::Vector{Dict})
Closest candidates are:
  keymap(::Any, ::Union{REPL.LineEdit.HistoryPrompt,
REPL.LineEdit.PrefixHistoryPrompt}) at
/Users/aviatesk/julia/julia/usr/share/julia/stdlib/v1.6/REPL/src/LineEdit.jl:2007
  keymap(::Union{Vector{Dict{Any, Any}}, Vector{Dict{Char, Any}}}) at
/Users/aviatesk/julia/julia/usr/share/julia/stdlib/v1.6/REPL/src/LineEdit.jl:1610
  keymap(::REPL.LineEdit.PromptState, ::REPL.LineEdit.Prompt) at
/Users/aviatesk/julia/julia/usr/share/julia/stdlib/v1.6/REPL/src/LineEdit.jl:2505
  ...
  ```
- I think it's better to keep the generics anyway, so this PR widens
function while keeping specialization by annotating `at-specialize`
jmert added a commit to jmert/BitFlags.jl that referenced this pull request Sep 10, 2020
Keno pushed a commit that referenced this pull request Jun 5, 2024
* Infer better eachindex broadcasting

* Fix a misuse of show_datatype

* Improve inference in vcat(A::BitMatrix...)

Because the tuple-length is unknown and because inference gives up
easily in the face of missing type parameters, the generator expressions
in the previous implementation were poorly inferred.

* Use Vector{String}  in Cmd field type

* Introduce ntupleany and use mapany in more places

This also makes mapany safe for iterators without `length`

* Add types to some comprehensions and lists

* Add some type-asserts and argtypes

* AbstractString->String in Distributed.ProcessGroup

* Update base/Enums.jl

* Update base/abstractarray.jl

Co-authored-by: Pablo Zubieta <8410335+pabloferz@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:latency Compiler latency
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants