Add basic optimizations #249

Merged 142 commits into main from optimize on Dec 1, 2022

@MarcelGarus MarcelGarus commented Nov 11, 2022

Depends on #246

This PR adds optimizations.

The problem

Before, use expressions led to exponential code blowup: In the Core package, several files contained a use "equality" import. At runtime, each of these imports instantiated the equality module again. Especially for taller hierarchies of modules, this can easily lead to an exponential amount of code being executed.
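
To make the blowup concrete, here is a rough Rust sketch (Rust only because the compiler is written in it). It counts instantiations when every use re-instantiates its target. The module graph is hypothetical: it assumes each layer uses the layer below it twice, which is not the real layout of the Core package.

```rust
use std::collections::HashMap;

/// Counts how many module instantiations happen when `module` is imported and
/// every `use` instantiates its target again. `imports` maps a module to the
/// modules it uses; the graph is assumed to be acyclic.
fn count_instantiations(module: usize, imports: &HashMap<usize, Vec<usize>>) -> u64 {
    let mut count = 1; // instantiating `module` itself
    if let Some(children) = imports.get(&module) {
        for &child in children {
            count += count_instantiations(child, imports);
        }
    }
    count
}

/// Builds a hypothetical hierarchy: module `n` uses module `n - 1` twice
/// (for example from two different files), and module 0 uses nothing.
fn layered_imports(height: usize) -> HashMap<usize, Vec<usize>> {
    (1..=height)
        .map(|layer| (layer, vec![layer - 1, layer - 1]))
        .collect()
}
```

Under these assumptions the count for height n is 2^(n+1) - 1, so every additional layer roughly doubles the work.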

We thought about only instantiating modules the first time they are imported (and only returning a reference the other times), but this could lead to other weird behavior. For example, if a package Foo exports something mutable (for example, something containing a channel), this could happen:

bar = use "Bar"
baz = use "Baz"  # May be different depending on whether Bar is imported.

Essentially, Bar could use "Foo" and send something to the channel in its top-level scope, and Baz could use "Foo" and receive from that channel to find out whether Bar was imported before. This contradicts our intuition about how a functional language should behave.
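
The hazard can be modeled outside of Candy. The following Rust sketch is only an analogy: Foo, the queue standing in for a channel, and both function names are made up for illustration.

```rust
use std::collections::VecDeque;

/// Stand-in for a module "Foo" that is instantiated once and shared. It
/// exports something mutable; a queue plays the role of the channel.
#[derive(Default)]
struct Foo {
    channel: VecDeque<&'static str>,
}

/// Top-level code of "Bar": uses the shared Foo and sends a message.
fn instantiate_bar(foo: &mut Foo) {
    foo.channel.push_back("Bar was here");
}

/// Top-level code of "Baz": uses the shared Foo and checks whether a message
/// is waiting. The returned value is what Baz would export.
fn instantiate_baz(foo: &mut Foo) -> bool {
    foo.channel.pop_front().is_some()
}
```

With a shared Foo, the value Baz exports depends on whether Bar was instantiated first, which is exactly the import-order dependence described above.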

The solution

Instead, we can perform a handful of optimizations, such as common subtree elimination, statically before executing any code. Take the following code:

# in the "equality" module:
equals := { ... }

# in the "bool" module
equals = (use "equality").equals
and a b := ... # uses the equals function

# in the "int" module
equals = (use "equality").equals
add a b := ... # uses the equals function

# in the "core" module
bool := use "bool"
equals := (use "equality").equals
int := use "int"

Each time a module is first encountered, its imports are inlined and it is optimized. After optimizations, the modules would look like this:

# in the "equality" module:
equals := { ... }

# in the "bool" module
equals := { ... }
and a b := ... # uses the equals function

# in the "int" module
equals := { ... }
add a b := ... # uses the equals function

# in the "core" module
equals := { ... }
and a b := ... # uses the equals function
add a b := ... # uses the equals function

Note that in each optimized module, every constant exists only once. Most things that libraries define are constants (such as strings, ints, and, most importantly, functions with known captured arguments) as opposed to non-constants such as channels, so this dramatically reduces the amount of code to be executed.
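
As a very rough illustration of this idea, the Rust sketch below inlines imports module by module and keeps every constant only once. It is not the PR's implementation: the Module struct, the comparison of constants by name and body text, and the cache are all simplifying assumptions, and exports versus private definitions are ignored.

```rust
use std::collections::HashMap;

/// Constants as (name, body) pairs; comparing bodies as text is an assumption.
type Constants = Vec<(&'static str, &'static str)>;

/// A hypothetical, heavily simplified module.
struct Module {
    uses: Vec<&'static str>,
    constants: Constants,
}

/// Returns the constants of `name` after inlining its imports. A module is
/// optimized the first time it is encountered; later uses hit `cache`.
fn optimize(
    name: &'static str,
    modules: &HashMap<&'static str, Module>,
    cache: &mut HashMap<&'static str, Constants>,
) -> Constants {
    if let Some(done) = cache.get(name) {
        return done.clone();
    }
    let module = &modules[name];
    let mut result: Constants = Vec::new();
    for &used in &module.uses {
        for constant in optimize(used, modules, cache) {
            if !result.contains(&constant) {
                result.push(constant); // keep each inlined constant once
            }
        }
    }
    for &constant in &module.constants {
        if !result.contains(&constant) {
            result.push(constant);
        }
    }
    cache.insert(name, result.clone());
    result
}

/// The four modules from the example above.
fn example_modules() -> HashMap<&'static str, Module> {
    HashMap::from([
        ("equality", Module { uses: vec![], constants: vec![("equals", "{ ... }")] }),
        ("bool", Module { uses: vec!["equality"], constants: vec![("and", "...")] }),
        ("int", Module { uses: vec!["equality"], constants: vec![("add", "...")] }),
        ("core", Module { uses: vec!["bool", "equality", "int"], constants: vec![] }),
    ])
}
```

Optimizing "core" with an empty cache yields equals, and, and add once each, matching the optimized core module shown above.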

The implementation

To achieve this, I introduced a new intermediate representation, the MIR (Mid-Level Intermediate Representation), between the HIR and LIR. This representation is designed to make writing optimization passes easy. In addition, some HIR expressions such as needs and compiler error nodes are desugared, and the responsibilities of calls are tracked explicitly.
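
To give a feeling for what explicit responsibility tracking can look like, here is a minimal Rust sketch. The Id and Expression types are hypothetical and much smaller than the real MIR; only the printed format is taken from the MIR dump in this PR.

```rust
use std::fmt;

/// ID of a MIR expression, printed as `$0`, `$1`, ...
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Id(usize);

impl fmt::Display for Id {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "${}", self.0)
    }
}

/// A hypothetical subset of MIR expressions. Calls and panics carry the ID of
/// the expression that is responsible if something goes wrong.
enum Expression {
    Reference(Id),
    Call { function: Id, arguments: Vec<Id>, responsible: Id },
    Panic { reason: Id, responsible: Id },
}

impl fmt::Display for Expression {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Expression::Reference(id) => write!(f, "{}", id),
            Expression::Call { function, arguments, responsible } => {
                write!(f, "call {} with", function)?;
                for argument in arguments {
                    write!(f, " {}", argument)?;
                }
                write!(f, " ({} is responsible)", responsible)
            }
            Expression::Panic { reason, responsible } => {
                write!(f, "panicking because {} ({} is at fault)", reason, responsible)
            }
        }
    }
}
```

Because the responsible party is an ordinary ID, passes can move or inline a call without losing track of who is at fault.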

These are the implemented optimizations:

  • common subtree elimination: When two expressions are equal, replace the second one with a reference to the first.
  • constant folding: Execute some builtins at compile-time.
  • constant lifting: Lift constants out of closures and into the surrounding scope.
  • inlining: Inline functions. For now, this is only done for functions containing use expressions.
  • module folding: Constant-evaluate use expressions with known targets by inlining the optimized MIR of the module.
  • multiple flattening: Flatten nested expressions (more of an implementation detail).
  • reference following: Replace usages of references with the original ID.
  • tree shaking: Remove unused pure expressions.
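
Three of these passes (common subtree elimination, reference following, and tree shaking) can be sketched on a toy representation. The Rust code below is an illustration under strong assumptions, not the code from this PR: Expr and Body are made up, bodies are flat, every expression is pure, and the last expression is the result.

```rust
use std::collections::{HashMap, HashSet};

type Id = usize;

/// A toy expression; everything is assumed to be pure.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Expr {
    Int(i64),
    Reference(Id),
    Call(&'static str, Vec<Id>), // builtin name and argument IDs
}

/// A flat body; the last expression is the result.
type Body = Vec<(Id, Expr)>;

/// An expression equal to an earlier one becomes a reference to it.
fn eliminate_common_subtrees(body: &mut Body) {
    let mut seen: HashMap<Expr, Id> = HashMap::new();
    for (id, expr) in body.iter_mut() {
        if matches!(*expr, Expr::Reference(_)) {
            continue;
        }
        let first = seen.get(&*expr).copied();
        match first {
            Some(first) => *expr = Expr::Reference(first),
            None => {
                seen.insert(expr.clone(), *id);
            }
        }
    }
}

/// Usages of an ID that is only a reference are replaced with the original ID.
fn follow_references(body: &mut Body) {
    let mut targets: HashMap<Id, Id> = HashMap::new();
    for (id, expr) in body.iter_mut() {
        match expr {
            Expr::Reference(target) => {
                // Earlier references are already resolved, so one lookup suffices.
                let original = targets.get(&*target).copied().unwrap_or(*target);
                *target = original;
                targets.insert(*id, original);
            }
            Expr::Call(_, arguments) => {
                for argument in arguments.iter_mut() {
                    if let Some(original) = targets.get(&*argument).copied() {
                        *argument = original;
                    }
                }
            }
            Expr::Int(_) => {}
        }
    }
}

/// Removes expressions that the result does not depend on.
fn shake_tree(body: &mut Body) {
    let mut used: HashSet<Id> = HashSet::new();
    if let Some((last, _)) = body.last() {
        used.insert(*last);
    }
    // Walk backwards so that usages are seen before definitions.
    for (id, expr) in body.iter().rev() {
        if !used.contains(id) {
            continue;
        }
        match expr {
            Expr::Reference(target) => {
                used.insert(*target);
            }
            Expr::Call(_, arguments) => used.extend(arguments.iter().copied()),
            Expr::Int(_) => {}
        }
    }
    body.retain(|(id, _)| used.contains(id));
}

/// Two rounds suffice for small inputs; a real pipeline would likely repeat
/// until nothing changes.
fn optimize(body: &mut Body) {
    for _ in 0..2 {
        eliminate_common_subtrees(body);
        follow_references(body);
    }
    shake_tree(body);
}
```

The passes feed each other: two calls with the same builtin but different argument IDs only become equal once reference following has rewritten their arguments to the same originals, which is why the sketch runs the first two passes more than once.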

Take this code example:

core = use "..Core"

main := { environment ->
  print message =
    needs (core.text.is message)
    core.channel.send environment.stdout message

  print "Hello, world!"
}

It results in the following MIR (comments added by me):

# Referenced HIR IDs
$0 = HirId(anonymous:$generated::needs)
$1 = HirId(user:"/home/marcel/projects/candy":packages/Benchmark:80:2)
$2 = HirId(user:"/home/marcel/projects/candy":packages/Benchmark:80:print:10)
$3 = HirId(user:"/home/marcel/projects/candy":packages/Benchmark:80:print:21)
$4 = HirId(user:"/home/marcel/projects/candy":packages/Benchmark:80:print:23)
$5 = HirId(user:"/home/marcel/projects/candy":packages/Benchmark:80:print:8)
$6 = HirId(user:"/home/marcel/projects/candy":packages/Core/bool:is:10)
$7 = HirId(user:"/home/marcel/projects/candy":packages/Core/bool:is:7)
$8 = HirId(user:"/home/marcel/projects/candy":packages/Core/bool:is:9:3)
$9 = HirId(user:"/home/marcel/projects/candy":packages/Core/bool:not:12)
$10 = HirId(user:"/home/marcel/projects/candy":packages/Core/bool:not:2)
$11 = HirId(user:"/home/marcel/projects/candy":packages/Core/bool:not:4)
$12 = HirId(user:"/home/marcel/projects/candy":packages/Core/channel:isReceivePort:3)
$13 = HirId(user:"/home/marcel/projects/candy":packages/Core/channel:isSendPort:3)
$14 = HirId(user:"/home/marcel/projects/candy":packages/Core/channel:send:12)
$15 = HirId(user:"/home/marcel/projects/candy":packages/Core/channel:send:14)
$16 = HirId(user:"/home/marcel/projects/candy":packages/Core/channel:send:21)
$17 = HirId(user:"/home/marcel/projects/candy":packages/Core/channel:send:6)
$18 = HirId(user:"/home/marcel/projects/candy":packages/Core/channel:send:7)
$19 = HirId(user:"/home/marcel/projects/candy":packages/Core/channel:send:9)
$20 = HirId(user:"/home/marcel/projects/candy":packages/Core/equality:equals:6)
$21 = HirId(user:"/home/marcel/projects/candy":packages/Core/text:is:6)
$22 = HirId(user:"/home/marcel/projects/candy":packages/Core/type:is:3)
$23 = HirId(user:"/home/marcel/projects/candy":packages/Core/type:is:5)
$24 = HirId(user:"/home/marcel/projects/candy":packages/Core/type:typeOf:5)

# Builtins
$25 = builtinChannelSend
$26 = builtinEquals
$27 = builtinIfElse
$28 = builtinStructGet
$29 = builtinTypeOf
$30 = False
$31 = Main
$32 = Nothing
$33 = ReceivePort
$34 = SendPort
$35 = Stdout
$36 = Text
$37 = True

# Strings
$38 = "Hello, world!"
$39 = "The `condition` must be either `True` or `False`."
$40 = "The `reason` must be a text."
$41 = "`core.text.is message` was not satisfied"
$42 = "`is a` was not satisfied"
$43 = "`isSendPort port` was not satisfied"
$44 = "`port` should be a send port, not a receive port."

# Constant lambdas

# Lambda returning nothing
$45 = {  (+ responsible $46) ->
  $47 = $32
}
# Lambda returning True
$48 = {  (+ responsible $49) ->
  $50 = $37
}
# needs
$51 = { $52 $53 $54 (+ responsible $55) ->
  $56 = call $26 with $52 $37 ($0 is responsible)
  $57 = call $26 with $52 $30 ($0 is responsible)
  $58 = {  (+ responsible $59) ->
    $60 = $57
  }
  $61 = call $27 with $56 $48 $58 ($0 is responsible)
  $62 = {  (+ responsible $63) ->
    $64 = panicking because $39 ($55 is at fault)
  }
  $65 = call $27 with $61 $45 $62 ($0 is responsible)
  $66 = call $29 with $53 ($55 is responsible)
  $67 = call $26 with $66 $36 ($55 is responsible)
  $68 = {  (+ responsible $69) ->
    $70 = panicking because $40 ($55 is at fault)
  }
  $71 = call $27 with $67 $45 $68 ($0 is responsible)
  $72 = {  (+ responsible $73) ->
    $74 = panicking because $53 ($54 is at fault)
  }
  $75 = call $27 with $52 $45 $72 ($0 is responsible)
}
$76 = {  (+ responsible $77) ->
  $78 = $30
}
$79 = { $80 $81 (+ responsible $82) ->
  $83 = call $26 with $80 $81 ($20 is responsible)
  $84 = $83
}
$85 = { $86 (+ responsible $87) ->
  $88 = call $29 with $86 ($24 is responsible)
  $89 = $88
}
$90 = { $91 $92 (+ responsible $93) ->
  $94 = call $85 with $91 ($22 is responsible)
  $95 = call $79 with $94 $92 ($23 is responsible)
  $96 = $95
}
$97 = { $98 (+ responsible $99) ->
  $100 = call $79 with $98 $37 ($7 is responsible)
  $101 = {  (+ responsible $102) ->
    $103 = call $79 with $98 $30 ($8 is responsible)
    $104 = $103
  }
  $105 = call $27 with $100 $48 $101 ($6 is responsible)
  $106 = $105
}
$107 = { $108 (+ responsible $109) ->
  $110 = call $97 with $108 ($10 is responsible)
  $111 = call $51 with $110 $42 $109 ($11 is responsible)
  $112 = call $27 with $108 $76 $48 ($9 is responsible)
  $113 = $112
}
$114 = { $115 (+ responsible $116) ->
  $117 = call $90 with $115 $34 ($13 is responsible)
  $118 = $117
}
$119 = { $120 (+ responsible $121) ->
  $122 = call $90 with $120 $33 ($12 is responsible)
  $123 = $122
}
$124 = { $125 $126 (+ responsible $127) ->
  $128 = call $119 with $125 ($17 is responsible)
  $129 = call $107 with $128 ($18 is responsible)
  $130 = call $51 with $129 $44 $127 ($19 is responsible)
  $131 = call $114 with $125 ($14 is responsible)
  $132 = call $51 with $131 $43 $127 ($15 is responsible)
  $133 = call $25 with $125 $126 ($16 is responsible)
  $134 = $133
}
$135 = { $136 (+ responsible $137) ->
  $138 = call $90 with $136 $36 ($21 is responsible)
  $139 = $138
}
$140 = { $141 (+ responsible $142) ->
  $143 = { $144 (+ responsible $145) ->
    $146 = call $135 with $144 ($5 is responsible)
    $147 = call $51 with $146 $41 $145 ($2 is responsible)
    $148 = call $28 with $141 $35 ($3 is responsible)
    $149 = call $124 with $148 $144 ($4 is responsible)
    $150 = $149
  }
  $151 = call $143 with $38 ($1 is responsible)
  $152 = $151
}
# Export map [Main: {...}]
$153 = [$31: $140]
... (lots of moduleStarts and moduleEnds expressions)
$1050 = $153

Note: The moduleStarts and moduleEnds expressions can be optimized away, but I think the PR is big enough already (sorry about that btw).

Performance

In release mode, the optimization takes 6.3 seconds. That's definitely faster than use "Core" took before, but it's still far from fast. A quick look at the flamegraph reveals that visit_with_visible dominates the runtime, which makes sense: to communicate all visible expressions to the visitor, I think all expressions are cloned. I'll have a look at that in another PR.
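
If cloning really is the cause, one possible direction is to lend the visible expressions to the visitor instead of copying them. The Rust sketch below shows that idea for a flat body only. The function visit_with_visible_borrowed and its signature are hypothetical; this is not how the existing visit_with_visible is defined, and nested lambdas are not handled.

```rust
/// Calls `visitor` with each expression and a borrowed slice of all
/// expressions defined before it. Nothing is cloned.
fn visit_with_visible_borrowed<E>(body: &mut [E], mut visitor: impl FnMut(&mut E, &[E])) {
    for index in 0..body.len() {
        // Split the body so that the visible prefix and the current
        // expression are disjoint borrows.
        let (visible, rest) = body.split_at_mut(index);
        let current = &mut rest[0];
        visitor(current, &*visible);
    }
}
```

The visitor can still mutate the current expression, while everything before it is only read, so no copy of the prefix is needed.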

Still, running the "Hello, world!" program itself is now really fast. After some performance tuning, AOT compilation might be interesting next.

@JonasWanke JonasWanke left a comment

We could move the optimizations into compiler::mir::optimize to make it clear that these operate on the MIR.

I viewed all files of the first 100 commits (until “Add optimize_opinionated method” (c12b19b))

MarcelGarus and others added 2 commits November 24, 2022 20:59
Co-Authored-By: Jonas Wanke <contact@wanke.dev>
Co-Authored-By: Jonas Wanke <contact@wanke.dev>

MarcelGarus commented Nov 24, 2022

Regarding moving the optimize module: Rather than having the optimizations as a submodule of the mir module, I'd rename the optimize module to optimize_mir – it's on a similar level to conversion modules such as hir_to_mir, just that it goes from MIR to MIR. What's your opinion?

Co-Authored-By: Jonas Wanke <contact@wanke.dev>
@JonasWanke JonasWanke merged commit b709ace into main Dec 1, 2022
@JonasWanke JonasWanke deleted the optimize branch December 1, 2022 19:04
Labels
  • P: Compiler: Frontend (Package: The compiler frontend)
  • P: Compiler: VS Code Extension (Package: The Candy VS Code Extension)
  • P: Core (Package: Candy's standard library)
  • T: Performance (Type: Performance Improvements)
  • T: Refactor (Type: Refactoring)
  • Z [Deprecated] P: Compiler/Hir ([Deprecated] Package: Compiler/HIR (High-Level Intermediate Representation))
  • Z [Deprecated] P: Compiler/Mir ([Deprecated] Package: Compiler/Mir)
3 participants