-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reimplement logical-scheduler walker more efficiently #741
Conversation
Here's timing output from
NOTE: the code to run above benchmark data is in a pending PR (#733). However, although running the specific benchmark relies on that PR, this PR does NOT depend on that PR. NOTE 2: Environment info:
|
FYI, here are charts (as PNG files) showing the main performance improvements of this optimization: Based on CSV data here: |
Super cool! Over 2x improvement in a lot of benchmarks. :D |
Wunderbar! |
@@ -179,14 +179,22 @@ | |||
(last-instrs :initform nil | |||
:accessor lscheduler-last-instrs | |||
:documentation "List of the instructions appearing at the \"bottom\" of a logical scheduler. These are sorted topologically ascending: earlier items in the list come logically after deeper items in the list.") | |||
(later-instrs :initform (make-hash-table :test #'eql) | |||
(later-instrs :initform (make-instr-hash-table) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO make-instruction-hash-table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I guess instr
is commonly used in the lscheduler
code, so 🤷🏽
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷🏽 here too, but I found instr standalone was more often used than the spelled out variant.
@mhdavid-hrl I'm ready to merge. All good? |
Could I just squash the typo commit with the base commit and eliminate the 1-line commit message, just for a slightly nicer history? |
Yes of course; go ahead. |
…verse DAG Lscheduler-walk-graph was reimplemented. It now runs a fairly standard DFS-based topological ordering, but this is done on the reverse DAG that's embodied by the structure, allowing the functions at each node to be called without allocating any storage that the GC has to deal with. This DFS uses an explicit stack, not recursion. A couple of earlier attempts to do a DFS-based topological sorting failed in extreme cases: - a straightforward recursive-function based DFS: would die with Lisp stack overflow for huge graphs - a straightforward explicit-stack DFS on the forward DAG, i.e., collecting a topo-sorted list in reverse order: the list destined to become garbage would overwhelm SBCL GC, resulting in "heap exhausted during garbage collection" errors in extreme cases (e.g., some 600K+ node graph encountered in qasm benchmmarks). Besides the new walker, a few small tweaks in this file: - lscheduler-all-instructions implemented more efficiently - lscheduler-calculate-log-fidelity tweaked to call binding-from-instr less - lscheduler-calculate-volume doc string added
Done with rebase/squash/fix commit log, and all checks have passed. Good to merge now. |
Just adding here that I just noticed this partially addressed issue "Clean up lscheduler-walk-graph #96". |
Lscheduler-walk-graph was reimplemented. It now runs a fairly standard
DFS-based topological ordering, but this is done on the reverse DAG
that's embodied by the structure, allowing the functions at each node
to be called without allocating any storage that the GC has to deal
with. This DFS uses an explicit stack, not recursion. A couple of
earlier attempts to do a DFS-based topological sorting failed in
extreme cases:
a straightforward recursive-function based DFS: would die with Lisp
stack overflow for huge graphs
a straightforward explicit-stack DFS on the forward DAG, i.e.,
collecting a topo-sorted list in reverse order: the list destined to
become garbage would overwhelm SBCL GC, resulting in "heap exhausted
during garbage collection" errors in extreme cases (e.g., some 600K+
node graph encountered in qasm benchmmarks).
Besides the new walker, a few small tweaks in this file: