Optimize concat for the WebGL backend #1449
Yay! Are there any other ops that work like concat used to that we could quickly optimize?
Good q. A quick search for ( |
Reviewed 5 of 5 files at r1.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @dsmilkov and @nsthorat)
src/kernels/backend_webgl.ts, line 632 at r1 (raw file):
}
if (tensors.length > ENV.get('WEBGL_MAX_TEXTURES_IN_SHADER')) {
  const midIndex = tensors.length >> 2;
can you use division instead of bitshifting here for readability?
Reviewable status: complete! 2 of 1 approvals obtained
src/kernels/backend_webgl.ts, line 632 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
can you use division instead of bitshifting here for readability?
Done.
Speed up WebGL concat by about 5x without shader warmup and about 2x with shader warmup.
Benchmark used:
Without warmup: 626ms (this PR) vs 3066ms (master)
With warmup: 34ms (this PR) vs 65ms (master)
The benchmark above reflects a real workflow of preparing training data (stacking 400 examples, 696 values each), taken from the Audio recognition codelab. Measuring both with warmup and without warmup is important since in the use-case of collecting examples, the number of examples is dynamic and causes recompilation of the shaders.
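To illustrate why both numbers are reported, here is a minimal sketch of a harness that times the first ("cold") call separately from later ("warm") calls. It is not the benchmark used in this PR: `timeColdAndWarm` and `Timing` are hypothetical names, and the callback is assumed to block until the result is ready (with TensorFlow.js that could be something like `() => tf.stack(examples).dataSync()`).

```typescript
// Hypothetical helper, not part of this PR.
interface Timing {
  coldMs: number;  // first call, includes one-time costs such as shader compilation
  warmMs: number;  // average of the following calls
}

function timeColdAndWarm(run: () => void, warmRuns: number = 5): Timing {
  // Cold run: the first invocation pays for any compilation or caching.
  const coldStart = Date.now();
  run();
  const coldMs = Date.now() - coldStart;

  // Warm runs: same work again, now with everything already compiled.
  const warmStart = Date.now();
  for (let i = 0; i < warmRuns; i++) {
    run();
  }
  const warmMs = (Date.now() - warmStart) / warmRuns;

  return {coldMs, warmMs};
}
```

If the number of stacked examples changes between calls, each new count may trigger another cold run, which is why the cold number matters for the example-collection workflow.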
Details
WEBGL_MAX_TEXTURES_IN_SHADER: an environment value that gives the maximum number of textures we can have as uniform samplers in a single shader.
PERF
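The splitting quoted in the review (halving the input list when it exceeds the texture limit) can be sketched on plain arrays. This is an illustration under assumptions, not the backend code: `MAX_INPUTS_PER_PASS` stands in for `ENV.get('WEBGL_MAX_TEXTURES_IN_SHADER')` with a made-up value, and `concatOnePass` stands in for a single shader invocation.

```typescript
// Stand-in for ENV.get('WEBGL_MAX_TEXTURES_IN_SHADER'); the value 4 is made up.
const MAX_INPUTS_PER_PASS = 4;

// Stand-in for one shader invocation: it may read at most
// MAX_INPUTS_PER_PASS inputs at once.
function concatOnePass(inputs: number[][]): number[] {
  if (inputs.length > MAX_INPUTS_PER_PASS) {
    throw new Error('too many inputs for a single pass');
  }
  return inputs.reduce((acc, cur) => acc.concat(cur), [] as number[]);
}

// Concatenate any number of inputs by recursively halving the list
// until each piece fits into a single pass.
function concatAll(inputs: number[][]): number[] {
  if (inputs.length > MAX_INPUTS_PER_PASS) {
    // Division instead of a bit shift, as requested in the review.
    const midIndex = Math.floor(inputs.length / 2);
    const left = concatAll(inputs.slice(0, midIndex));
    const right = concatAll(inputs.slice(midIndex));
    return concatOnePass([left, right]);
  }
  return concatOnePass(inputs);
}
```

With this scheme, each pass reads as many inputs as the limit allows instead of combining them two at a time.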