-
Notifications
You must be signed in to change notification settings - Fork 950
Optimize addN op by the multiple upload of textures #1538
Conversation
17158f2
to
93db4b1
Compare
src/kernels/webgl/addn_gpu.ts
Outdated
@@ -0,0 +1,51 @@ | |||
/** | |||
* @license | |||
* Copyright 2017 Google Inc. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2019 LLC
Looks good overall, one small note is that this isn't going to be supported when packing is enabled. If you feel motivated you can add the packed version of this, otherwise we can do this in a follow up and this LGTM :) |
@nsthorat Thanks! Is there any good reference (or code) to implement packed version? |
Check out the code here for binary ops that are packed: /~https://github.com/tensorflow/tfjs-core/blob/master/src/kernels/webgl/binaryop_packed_gpu.ts |
@nsthorat Thank you! Try to take a look. |
Let me know when this is ready to review :) |
@nsthorat Ah, sorry. It's basically ready for the review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r1, 2 of 3 files at r2.
Reviewable status: 0 of 1 approvals obtained (waiting on @Lewuathe)
src/kernels/webgl/addn_gpu.ts, line 3 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
2019 LLC
This license still needs updating 2019 Google LLC.
src/kernels/webgl/addn_packed_gpu.ts, line 3 at r2 (raw file):
/** * @license * Copyright 2017 Google Inc. All Rights Reserved.
2019 Google LLC
src/kernels/webgl/backend_webgl.ts, line 1474 at r2 (raw file):
addN<T extends Tensor>(tensors: T[]): T { if (tensors.length === 1) { return tensors[0];
return tensors[0].clone()
src/kernels/webgl/backend_webgl.ts, line 1490 at r2 (raw file):
const shapes = tensors.map(t => t.shape); // We can make sure shapes are identical in op level. const usePackedOp = ENV.getBool('WEBGL_PACK_BINARY_OPERATIONS');
this isn't really a binary operation, I think we might want to make a new flag for this
src/kernels/webgl/backend_webgl.ts
Outdated
.reduce((d1, d2) => upcastType(d1, d2)); | ||
const shapes = tensors.map(t => t.shape); | ||
// We can make sure shapes are identical in op level. | ||
const usePackedOp = ENV.getBool('WEBGL_PACK_BINARY_OPERATIONS'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just check WEBGL_PACK
here rather than checking WEBGL_PACK_BINARY_OPERATIONS
or creating a new flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out!
@nsthorat @annxingyuan I updated it accordingly. Please take a look when you get a chance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r4.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @annxingyuan and @Lewuathe)
PERF
As well as the optimization for concat op, we can optimize
addN
op by uploading multiple textures up toWEBGL_MAX_TEXTURES_IN_SHADER
so that we can reduce the overhead of texture upload.Here is the benchmark of optimized
addN
ops. We can see the significant performance improvement in case of over 4000 tensors.Environment
See: tensorflow/tfjs#989
This change is