Add primitive cache for MKL-DNN sum (elemwise_add) operator #14914
Thanks for the improvements. Could you share the performance data?
@mxnet-label-bot add [pr-awaiting-review, MKLDNN]
Here's some performance data I collected from a customer case. Caching the primitive brings a ~1.6x speedup:
1) Sum primitive not cached (current code base)
2) Sum primitive cached
  int ndim = input.shape().ndim();
  return input.dtype() == mshadow::kFloat32 && (ndim >= 1 && ndim <= 4) &&
         input.storage_type() == kDefaultStorage;
}
Why not put this function into mkldnn_base-inl.h?
Currently, some MKLDNN ops support ndim in [1, 4], while some ops still don't support ndim=3.
For now, there will be several such support-check functions for different ops.
But indeed, we can combine the similar functions once all the ops are finalized.
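As a rough illustration of what such a combined check might look like later, here is a minimal standalone sketch. It does not use MXNet types: `DType`, `Storage`, `NDimBit`, `SupportsMKLDNN`, and the two mask constants are hypothetical names invented for this example. A bitmask is used rather than a min/max range so that an op lacking ndim=3 support can still be described.

```cpp
#include <cassert>

// Hypothetical stand-ins for the framework's dtype and storage enums.
enum class DType { kFloat32, kFloat64, kInt32 };
enum class Storage { kDefault, kRowSparse, kCSR };

// Bit n of a mask is set when an op supports inputs with ndim == n.
constexpr unsigned NDimBit(int ndim) { return 1u << ndim; }

// Hypothetical per-op masks: one op family handles ndim 1..4,
// another handles everything in that range except ndim == 3.
constexpr unsigned kNDim1To4 = NDimBit(1) | NDimBit(2) | NDimBit(3) | NDimBit(4);
constexpr unsigned kNDimNo3 = NDimBit(1) | NDimBit(2) | NDimBit(4);

// Shared check (an assumption, not the PR's code): float32 data,
// default (dense) storage, and an ndim allowed by the op's mask.
inline bool SupportsMKLDNN(DType dtype, Storage stype, int ndim,
                           unsigned ndim_mask) {
  assert(ndim_mask != 0 && "mask must allow at least one ndim");
  return dtype == DType::kFloat32 && stype == Storage::kDefault &&
         ndim >= 0 && ndim < 32 && (ndim_mask & NDimBit(ndim)) != 0;
}
```

With this shape, each op would pass its own mask, for example `SupportsMKLDNN(dtype, stype, ndim, kNDim1To4)` for an op like sum, so the per-op functions collapse into one helper plus a constant.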
This is the common approach for MKL-DNN integration so it looks good.
Merging now. Thanks for your contribution.
* Add primitive cache for mkldnn sum
* fix cpp test failure
Description
This PR caches the created sum primitive to reduce creation overhead, which improves the performance of the current MKL-DNN sum (elemwise_add) operator.
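The general idea of primitive caching can be sketched as follows. This is a simplified standalone illustration, not the code in this PR: `SumPrimitive`, `MakeKey`, `GetSumPrimitive`, and `g_creations` are hypothetical names, and the cache key here is built only from the input count and shape, whereas a real key would need to cover everything that determines the primitive.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an expensive-to-create sum primitive.
struct SumPrimitive {
  std::vector<std::size_t> shape;
  std::size_t num_inputs;
};

// Counts how many primitives were actually constructed (for illustration).
static int g_creations = 0;

// Builds a cache key from the properties assumed to determine the primitive.
inline std::string MakeKey(const std::vector<std::size_t>& shape,
                           std::size_t num_inputs) {
  std::string key = std::to_string(num_inputs) + ":";
  for (std::size_t d : shape) key += std::to_string(d) + "x";
  return key;
}

// Returns the cached primitive for this key, creating it only on first use.
inline SumPrimitive& GetSumPrimitive(const std::vector<std::size_t>& shape,
                                     std::size_t num_inputs) {
  assert(num_inputs > 0);
  // One cache per thread, so lookups need no locking.
  static thread_local std::unordered_map<std::string, SumPrimitive> cache;
  const std::string key = MakeKey(shape, num_inputs);
  auto it = cache.find(key);
  if (it == cache.end()) {
    ++g_creations;  // the expensive creation would happen here
    it = cache.emplace(key, SumPrimitive{shape, num_inputs}).first;
  }
  return it->second;
}
```

Repeated calls with the same shape and input count return the same object, so the creation cost is paid once per distinct configuration rather than on every forward call.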
Comments
@pengzhao-intel @TaoLv