forked from apache/mxnet
Commit
Performance improvement in ToTensor GPU Kernel (apache#14099)
* CPU implementation without Kernel launch/map
* Optimal CUDA support for 3D ToTensor operator
* Add CUDA kernel for 4D inputs
* Fix failing CPU tests for totensor
* Disable warning on Windows
* Try fix in instance norm Windows build failure
* Guard omp parallel collapse for Windows
* Remove warning suppression to check if it is ok
* Fix lint issues
* Address code review comments
1 parent 4c83048, commit 45cb885
Showing 2 changed files with 153 additions and 30 deletions.