-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
@zheng-da @szha @eric-haibin-lin Please review |
@mxnet-label-bot add [Operator, pr-awaiting-review] |
befeddb
to
19e00c6
Compare
Great! Thanks for your contribution! |
The old code supports OpReq, but it is cancelled in this PR. |
@wkcn I don't think ReqType really matters here, cause kAddTo for grad would hardly make any sense... |
So I think it is better to check whether OpReq is kWrite or kInplace. |
What if this op is used during training where gradient accumulation is used? |
@wkcn kWriteTo and kWriteInplace actually share the same code in KERNEL_ASSIGN: /~https://github.com/apache/incubator-mxnet/blob/master/src/operator/mxnet_op.h#L304-L319 so no need to distinguish between them, I'll add the kAddTo support as per @szha's request. |
@haojin2 Sorry that I forgot it. After adding the kAddTo, LGTM : ) |
0036fee
to
c4b4cf4
Compare
I think we should CHECK(req[0]==kWriteTo||req[0]==kInplace); in the forward function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution!
Hope add CHECK(req[0]==kWriteTo||req[0]==kInplace); in the forward function.
4e181d9
to
0cc9d04
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! I think it is ready to merge.
Thank you for your contribution!
* speedup _contrib_index_copy * use copy in backward * add support for kAddTo req type for grad * change template to argument for req types
* speedup _contrib_index_copy * use copy in backward * add support for kAddTo req type for grad * change template to argument for req types
* speedup _contrib_index_copy * use copy in backward * add support for kAddTo req type for grad * change template to argument for req types
Description
Re-writing the Map kernel of contrib.index_copy to speed up index_copy operator for DGL usage.
Plus a small fix to operator registration to fix problem with symbolic API.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
benchmark results:
CPU
GPU: