Add flash attention to speedup fused_gate_attention. #52731
Conversation
Your PR has been submitted successfully. Thank you for your contribution to this open-source project!
Force-pushed from e58d315 to 5bee3a6
Force-pushed from 0cba876 to 9f76b5f
Merge commit: …sy/Paddle into add_flash_attn_for_af2
Force-pushed from d73c078 to c98186c
Force-pushed from c98186c to ade7a07
Force-pushed from ade7a07 to 1ddf939
LGTM
Force-pushed from c740bdc to 3b97303
LGTM
Review comment on cmake/external/flashattn.cmake (commit 7ff9f5e, outdated):
@@ -19,8 +19,8 @@ add_definitions(-DPADDLE_WITH_FLASHATTN)
 set(FLASHATTN_PREFIX_DIR ${THIRD_PARTY_PATH}/flashattn)
 set(FLASHATTN_SOURCE_SUBDIR csrc/flash_attn)
 set(FLASHATTN_INSTALL_DIR ${THIRD_PARTY_PATH}/install/flashattn)
-set(FLASHATTN_REPOSITORY ${GIT_URL}/PaddlePaddle/flash-attention.git)
-set(FLASHATTN_TAG 5ff4bbf56ad066750407c4aef16ac740ebda0717)
+set(FLASHATTN_REPOSITORY ${GIT_URL}/Xreki/flash-attention.git)
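For context, switching a pinned third-party dependency to a fork normally means updating the repository URL and the pinned commit tag together, since the old tag may not exist in the fork. A hedged sketch of the pattern (the SHA placeholder is illustrative, not the actual commit in this PR):

```cmake
# Illustrative only: when FLASHATTN_REPOSITORY points at a fork,
# FLASHATTN_TAG must name a commit that exists in that fork.
set(FLASHATTN_REPOSITORY ${GIT_URL}/Xreki/flash-attention.git)
set(FLASHATTN_TAG <fork-commit-sha>) # hypothetical placeholder
```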
Why use a personal repo?
cc @sneaxiy
The package size has grown quite a lot, so this is still being debugged; the personal repo is an intermediate state.
Force-pushed from b9debab to bee8537
PR types
Performance optimization
PR changes
OPs
Description
Pcard-70461
This is the last optimization for the AlphaFold models; it improves AlphaFold2 performance from 2.99 s/iter to 2.796 s/iter (about a 6.5% reduction in iteration time).
Relies on: Optimization for AlphaFold2 model (flash-attention#4)
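For readers unfamiliar with the op being fused here: gated attention, as used in AlphaFold2's Evoformer, is scaled dot-product attention whose output is modulated element-wise by a sigmoid gate. A minimal NumPy sketch of the computation (function and variable names are illustrative, not Paddle's `fused_gate_attention` API):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_attention(q, k, v, gate_logits):
    """Sketch of the gated-attention pattern: standard scaled
    dot-product attention whose output is multiplied by a sigmoid
    gate (in AlphaFold2 the gate is a linear projection of the
    query activations; here it is passed in directly)."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)  # [*, L_q, L_k]
    probs = softmax(scores, axis=-1)
    out = probs @ v                               # [*, L_q, d]
    gate = 1.0 / (1.0 + np.exp(-gate_logits))     # sigmoid gate
    return gate * out
```

The point of fusing this with flash attention is that the softmax(QK^T)V part never materializes the full [L_q, L_k] score matrix in global memory; the gating multiply is unchanged.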