[Phi] Support cudnn kernel moving & move softmax kernels #39547
Conversation
Thanks for your contribution!
LGTM for op benchmark
 public:
  using Type = float;
};
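For context, a fragment like this typically belongs to a mixed-precision type trait that maps fp16 to float for intermediate accumulation. A hypothetical self-contained sketch (the float16 stand-in and trait layout here are assumptions for illustration, not copied from the diff):

```cpp
#include <type_traits>

// Hypothetical fp16 stand-in; Paddle defines its own float16 type.
struct float16 {};

// Maps a compute type to the type used for intermediate accumulation:
// most types accumulate in themselves, but fp16 accumulates in float
// to avoid precision loss.
template <typename T>
class MPTypeTrait {
 public:
  using Type = T;
};

template <>
class MPTypeTrait<float16> {
 public:
  using Type = float;  // the specialization shown in the diff above
};
```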
This duplicates paddle/fluid/operators/amp/fp16_type_traits.h; consider merging the two later.
Yes, this was moved over. Going forward, using paddle/fluid/operators/amp/fp16_type_traits.h under fluid is not recommended, and that file can be deleted.
LGTM
LGTM
LGTM for op benchmark
PR types
New features
PR changes
Others
Describe
[Phi] Support cudnn kernel moving
Support registering GPUDNN kernels in phi, and migrate the softmax op kernel as verification.
Since there is no standalone GPUDNNContext, this introduces a problem into the current system. When we write using CUDNNContext = GPUContext, GPU and cuDNN kernel functions that share the same name collide at instantiation. For example,
SoftmaxKernel<float, GPUContext>
and SoftmaxKernel<float, GPUDNNContext>
are in fact one and the same instantiation, even though their implementations differ. At compile time the GPUDNN kernel instantiation is therefore overridden: what GPUDNN actually registers is the GPU kernel, which produces incorrect computation results at runtime. Because the design of GPUDNNContext is not yet settled, and it is essentially identical to the current GPUContext, the conflict is resolved for now by renaming: the GPUDNN kernel is named SoftmaxGPUDNNKernel.
The downside is that
SoftmaxKernel<T, Context>
can no longer dispatch to SoftmaxGPUDNNKernel, so the gpudnn kernels form a small subsystem of their own. The impact is limited, since there are very few cudnn kernels (single digits). TODO: confirm whether the original CUDA SoftmaxKernel still needs to be kept; if not, it would be more appropriate to delete it and move the cudnn kernel under gpu.