reimplement logical-scheduler walker more efficiently #741

ghost · 2021-10-12T20:14:55Z

Lscheduler-walk-graph was reimplemented. It now runs a fairly standard
DFS-based topological ordering, but this is done on the reverse DAG
that's embodied by the structure, allowing the functions at each node
to be called without allocating any storage that the GC has to deal
with. This DFS uses an explicit stack, not recursion. A couple of
earlier attempts to do a DFS-based topological sorting failed in
extreme cases:

a straightforward recursive-function based DFS: would die with Lisp
stack overflow for huge graphs
a straightforward explicit-stack DFS on the forward DAG, i.e.,
collecting a topo-sorted list in reverse order: the list destined to
become garbage would overwhelm SBCL GC, resulting in "heap exhausted
during garbage collection" errors in extreme cases (e.g., some 600K+
node graph encountered in qasm benchmmarks).

Besides the new walker, a few small tweaks in this file:

lscheduler-all-instructions implemented more efficiently
lscheduler-calculate-log-fidelity tweaked to call binding-from-instr less
lscheduler-calculate-volume doc string added

ghost · 2021-10-12T20:17:40Z

Here's timing output from make benchmark-nq for NEW WALKER:

+----------------------------+----------+-------+----------+
|        NAME                | TIME (s) | SWAPS | 2Q DEPTH |
+----------------------------+----------+-------+----------+
| STATIC/FULLY-CONNECTED 10  |     2.32 |     0 |        2 |
| STATIC/FULLY-CONNECTED 30  |    17.67 |     0 |        2 |
| STATIC/FULLY-CONNECTED 50  |    50.30 |     0 |        2 |
| STATIC/FULLY-CONNECTED 70  |   102.46 |     0 |        2 |
+----------------------------+----------+-------+----------+
+----------------------------+----------+-------+----------+
|        NAME                | TIME (s) | SWAPS | 2Q DEPTH |
+----------------------------+----------+-------+----------+
| BELL/FULLY-CONNECTED 10    |     4.72 |     0 |        9 |
| BELL/FULLY-CONNECTED 30    |    31.36 |     0 |       29 |
| BELL/FULLY-CONNECTED 50    |    70.14 |     0 |       49 |
| BELL/FULLY-CONNECTED 70    |   128.42 |     0 |       69 |
+----------------------------+----------+-------+----------+
+----------------------------+----------+-------+----------+
|        NAME                | TIME (s) | SWAPS | 2Q DEPTH |
+----------------------------+----------+-------+----------+
| QFT/FULLY-CONNECTED 10     |     9.42 |     0 |       37 |
| QFT/FULLY-CONNECTED 30     |    88.79 |     0 |      117 |
| QFT/FULLY-CONNECTED 50     |   298.36 |     0 |      197 |
| QFT/FULLY-CONNECTED 70     |   638.87 |     0 |      277 |
+----------------------------+----------+-------+----------+
+----------------------------+----------+-------+----------+
|        NAME                | TIME (s) | SWAPS | 2Q DEPTH |
+----------------------------+----------+-------+----------+
| HADAMARD/FULLY-CONNECTED 10 |     1.63 |     0 |        0 |
| HADAMARD/FULLY-CONNECTED 30 |    15.79 |     0 |        0 |
| HADAMARD/FULLY-CONNECTED 50 |    44.58 |     0 |        0 |
| HADAMARD/FULLY-CONNECTED 70 |    91.12 |     0 |        0 |
+----------------------------+----------+-------+----------+
+----------------------------+----------+-------+----------+
|        NAME                | TIME (s) | SWAPS | 2Q DEPTH |
+----------------------------+----------+-------+----------+
| STATIC/LINEAR 10           |     0.69 |     0 |        2 |
| STATIC/LINEAR 30           |     1.41 |     0 |        2 |
| STATIC/LINEAR 50           |     2.19 |     0 |        2 |
| STATIC/LINEAR 70           |     2.99 |     0 |        2 |
+----------------------------+----------+-------+----------+
+----------------------------+----------+-------+----------+
|        NAME                | TIME (s) | SWAPS | 2Q DEPTH |
+----------------------------+----------+-------+----------+
| BELL/LINEAR 10             |     2.57 |     0 |        9 |
| BELL/LINEAR 30             |     8.90 |     0 |       29 |
| BELL/LINEAR 50             |    16.58 |     0 |       49 |
| BELL/LINEAR 70             |    25.02 |     0 |       69 |
+----------------------------+----------+-------+----------+
+----------------------------+----------+-------+----------+
|        NAME                | TIME (s) | SWAPS | 2Q DEPTH |
+----------------------------+----------+-------+----------+
| QFT/LINEAR 10              |     5.64 |    61 |       79 |
| QFT/LINEAR 30              |    16.60 |   641 |      265 |
| QFT/LINEAR 50              |    59.71 |  1821 |      451 |
| QFT/LINEAR 70              |   153.12 |  3601 |      637 |
+----------------------------+----------+-------+----------+
+----------------------------+----------+-------+----------+
|        NAME                | TIME (s) | SWAPS | 2Q DEPTH |
+----------------------------+----------+-------+----------+
| HADAMARD/LINEAR 10         |     0.06 |     0 |        0 |
| HADAMARD/LINEAR 30         |     0.26 |     0 |        0 |
| HADAMARD/LINEAR 50         |     0.59 |     0 |        0 |
| HADAMARD/LINEAR 70         |     1.12 |     0 |        0 |
+----------------------------+----------+-------+----------+

NOTE: the code to run above benchmark data is in a pending PR (#733). However, although running the specific benchmark relies on that PR, this PR does NOT depend on that PR.

NOTE 2: Environment info:

Run on Fri Sep 17 20:40:24 PDT 2021
Hardware:
  Model Name:	MacBook Pro
  Model Identifier:	MacBookPro16,1
  Processor Name:	6-Core Intel Core i7
  Processor Speed:	2.6 GHz
  Number of Processors:	1
  Total Number of Cores:	6
  L2 Cache (per Core):	256 KB
  L3 Cache:	12 MB
Software:
  System Version:	macOS 10.15.7 (19H1217)
  Kernel Version:	Darwin 19.6.0

ghost · 2021-10-12T20:26:24Z

FYI, here are charts (as PNG files) showing the main performance improvements of this optimization:

Based on CSV data here:
benchmark-nq-data.csv

stylewarning · 2021-10-12T22:06:58Z

Super cool! Over 2x improvement in a lot of benchmarks. :D

notmgsk · 2021-10-12T22:18:55Z

Wunderbar!

src/addresser/logical-schedule.lisp

notmgsk · 2021-10-12T22:30:15Z

src/addresser/logical-schedule.lisp

@@ -179,14 +179,22 @@
   (last-instrs :initform nil
                :accessor lscheduler-last-instrs
                :documentation "List of the instructions appearing at the \"bottom\" of a logical scheduler.  These are sorted topologically ascending: earlier items in the list come logically after deeper items in the list.")
-   (later-instrs :initform (make-hash-table :test #'eql)
+   (later-instrs :initform (make-instr-hash-table)


IMO make-instruction-hash-table

Though I guess instr is commonly used in the lscheduler code, so 🤷🏽

🤷🏽 here too, but I found instr standalone was more often used than the spelled out variant.

stylewarning · 2021-10-19T20:12:00Z

@mhdavid-hrl I'm ready to merge. All good?

ghost · 2021-10-19T20:44:02Z

@mhdavid-hrl I'm ready to merge. All good?

Could I just squash the typo commit with the base commit and eliminate the 1-line commit message, just for a slightly nicer history?

stylewarning · 2021-10-19T21:32:07Z

@mhdavid-hrl I'm ready to merge. All good?

Could I just squash the typo commit with the base commit and eliminate the 1-line commit message, just for a slightly nicer history?

Yes of course; go ahead.

…verse DAG Lscheduler-walk-graph was reimplemented. It now runs a fairly standard DFS-based topological ordering, but this is done on the reverse DAG that's embodied by the structure, allowing the functions at each node to be called without allocating any storage that the GC has to deal with. This DFS uses an explicit stack, not recursion. A couple of earlier attempts to do a DFS-based topological sorting failed in extreme cases: - a straightforward recursive-function based DFS: would die with Lisp stack overflow for huge graphs - a straightforward explicit-stack DFS on the forward DAG, i.e., collecting a topo-sorted list in reverse order: the list destined to become garbage would overwhelm SBCL GC, resulting in "heap exhausted during garbage collection" errors in extreme cases (e.g., some 600K+ node graph encountered in qasm benchmmarks). Besides the new walker, a few small tweaks in this file: - lscheduler-all-instructions implemented more efficiently - lscheduler-calculate-log-fidelity tweaked to call binding-from-instr less - lscheduler-calculate-volume doc string added

ghost · 2021-10-19T22:15:32Z

Done with rebase/squash/fix commit log, and all checks have passed. Good to merge now.

ghost · 2021-11-10T01:10:42Z

Just adding here that I just noticed this partially addressed issue "Clean up lscheduler-walk-graph #96".

stylewarning requested review from braised-babbage, notmgsk and stylewarning October 12, 2021 22:05

notmgsk reviewed Oct 12, 2021

View reviewed changes

src/addresser/logical-schedule.lisp Outdated Show resolved Hide resolved

notmgsk reviewed Oct 12, 2021

View reviewed changes

stylewarning previously approved these changes Oct 19, 2021

View reviewed changes

ghost dismissed stylewarning’s stale review via 3d4d0cb October 19, 2021 21:59

stylewarning merged commit d84d198 into quil-lang:master Oct 21, 2021

ghost mentioned this pull request Nov 10, 2021

Clean up lscheduler-walk-graph #96

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reimplement logical-scheduler walker more efficiently #741

reimplement logical-scheduler walker more efficiently #741

ghost commented Oct 12, 2021

ghost commented Oct 12, 2021 •

edited by ghost

Loading

ghost commented Oct 12, 2021

stylewarning commented Oct 12, 2021

notmgsk commented Oct 12, 2021

notmgsk Oct 12, 2021

notmgsk Oct 12, 2021

ghost Oct 12, 2021

stylewarning commented Oct 19, 2021

ghost commented Oct 19, 2021 •

edited by ghost

Loading

stylewarning commented Oct 19, 2021

ghost commented Oct 19, 2021

ghost commented Nov 10, 2021

reimplement logical-scheduler walker more efficiently #741

reimplement logical-scheduler walker more efficiently #741

Conversation

ghost commented Oct 12, 2021

ghost commented Oct 12, 2021 • edited by ghost Loading

ghost commented Oct 12, 2021

stylewarning commented Oct 12, 2021

notmgsk commented Oct 12, 2021

notmgsk Oct 12, 2021

Choose a reason for hiding this comment

notmgsk Oct 12, 2021

Choose a reason for hiding this comment

ghost Oct 12, 2021

Choose a reason for hiding this comment

stylewarning commented Oct 19, 2021

ghost commented Oct 19, 2021 • edited by ghost Loading

stylewarning commented Oct 19, 2021

ghost commented Oct 19, 2021

ghost commented Nov 10, 2021

ghost commented Oct 12, 2021 •

edited by ghost

Loading

ghost commented Oct 19, 2021 •

edited by ghost

Loading