support gpu mixed precision inference #40531
Thanks for your contribution!
/// \brief Turn on GPU fp16 precision.
///
///
void EnableUseGpuFp16();
The interface name Exp_EnableUseGpuFp16 may later be changed to the more general EnableFp16.
Done.
///
/// \param op_list The operator type list.
///
void SetGpuFp16DisabledOp(std::unordered_set<std::string> op_list) {
Would it be more appropriate to pass this parameter to Exp_EnableUseGpuFp16 instead?
Done.
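For readers following this thread, here is a minimal sketch of what folding the disabled-op list into the enable call could look like. It is not the code from this PR: `SketchConfig`, `RunsInFp16`, and the member names are hypothetical stand-ins, and the default-argument signature is an assumption about the shape the reviewer suggested.

```cpp
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for the inference config; not Paddle's actual class.
class SketchConfig {
 public:
  // Turns on GPU fp16 and records the op types that must stay in fp32.
  // Assumption: an empty list means no op type is excluded.
  void Exp_EnableUseGpuFp16(std::unordered_set<std::string> op_list = {}) {
    use_gpu_fp16_ = true;
    gpu_fp16_disabled_op_types_ = std::move(op_list);
  }

  bool gpu_fp16_enabled() const { return use_gpu_fp16_; }

  // True when an op of this type would be allowed to run in fp16.
  bool RunsInFp16(const std::string& op_type) const {
    return use_gpu_fp16_ && gpu_fp16_disabled_op_types_.count(op_type) == 0;
  }

 private:
  bool use_gpu_fp16_ = false;
  std::unordered_set<std::string> gpu_fp16_disabled_op_types_;
};
```

With this shape, a caller would write `cfg.Exp_EnableUseGpuFp16({"softmax"});` in one step instead of calling an enable function and a separate setter, so the list cannot be set while fp16 is off.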
"conv_elementwise_add_fuse_pass", //
#endif //
"transpose_flatten_concat_fuse_pass", //
"mixed_precision_configure_pass", //
Could this use the same pass list as the default GpuPassStrategy constructor?
After fusion, some operators have no fp16 kernel, for example those produced by fc_fuse_pass.
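One way to reconcile the two positions, sketched under assumptions: keep a single default list and derive the fp16 list from it by dropping the passes whose fused ops lack an fp16 kernel. `BuildFp16PassList` is a hypothetical helper, not part of this PR, and the pass names passed to it in the usage below are illustrative only; just `fc_fuse_pass` is named in this thread as lacking an fp16 kernel.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns `default_passes` minus every pass named in
// `no_fp16_kernel_passes`, preserving the original order.
std::vector<std::string> BuildFp16PassList(
    const std::vector<std::string>& default_passes,
    const std::unordered_set<std::string>& no_fp16_kernel_passes) {
  std::vector<std::string> result;
  result.reserve(default_passes.size());
  std::copy_if(default_passes.begin(), default_passes.end(),
               std::back_inserter(result),
               [&](const std::string& name) {
                 return no_fp16_kernel_passes.count(name) == 0;
               });
  return result;
}
```

For example, `BuildFp16PassList(defaults, {"fc_fuse_pass"})` yields the default list without `fc_fuse_pass`. The trade-off is that the exclusion set has to be maintained by hand as fp16 kernels are added.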
TODO: before the official release, polish the code and add documentation.
LGTM
PR types
Others
PR changes
Others
Describe
Support GPU mixed-precision inference.