Adding initial support for gfx950 #1710

Open
wants to merge 6 commits into
base: develop

yoichiyoshida
Contributor

  • Add new MFMA instructions for gfx950
  • Enable F8 OCP for gfx950
  • Enable sparse gfx950 SMFMA tiles
  • Enable f8/b8 spmm
  • Add V_PRNG instruction to CodeGen
  • Add ds_read_b64_tr_b16 for gfx950
  • Add gfx950 sparse fp16, bf16 and int8
  • Add Scaled AB GEMMs
  • Add liblogic for gfx950 F8/BF8/F8BF8/BF8F8 input types with B/H/S output types
  • Add f8 macro guards for post-GSU kernels
  • Incorporate BF16 CVT into hipBLASLt
  • Add support for F8/F16/BF16 datatypes on gfx950
  • Add Hascvtf16_fp8 and Hascvtfp8_f16 asmCaps
  • Use F8->F16 and F16->F8 CVT instructions instead of a round trip via F32 on gfx950
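
For reviewers wondering why the F8->F16 direction can skip F32, here is a small host-side sketch in Python. It is not code from this PR and does not model the gfx950 instructions. It assumes the OCP FP8 encodings (E4M3 for F8, E5M2 for BF8), and the helper names `decode_e4m3`, `decode_e5m2` and `e4m3_fits_in_f16` are made up for illustration. It checks that every finite E4M3 value is exactly representable in F16, and that an E5M2 byte is simply the upper byte of an F16 bit pattern. The F16->F8 direction needs rounding and is not covered here.

```python
import struct


def decode_e4m3(byte: int) -> float:
    """Decode one OCP FP8 E4M3 byte (1 sign, 4 exponent, 3 mantissa bits, bias 7)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return float("nan")  # E4M3 has no infinities; S.1111.111 encodes NaN
    if exp == 0:
        return sign * man * 2.0 ** -9  # subnormal: (man / 8) * 2^-6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def decode_e5m2(byte: int) -> float:
    """Decode one OCP FP8 E5M2 byte by treating it as the upper byte of an F16."""
    return struct.unpack("<e", struct.pack("<H", (byte & 0xFF) << 8))[0]


def to_f16_and_back(x: float) -> float:
    """Round x to IEEE half precision and widen it again."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def e4m3_fits_in_f16() -> bool:
    """Return True if every non-NaN E4M3 value survives a trip through F16 unchanged."""
    for b in range(256):
        v = decode_e4m3(b)
        if v != v:  # skip NaN encodings
            continue
        if to_f16_and_back(v) != v:
            return False
    return True


if __name__ == "__main__":
    print("all finite E4M3 values exact in F16:", e4m3_fits_in_f16())
    print("E5M2 0x3C decodes to", decode_e5m2(0x3C))
```

Under these assumptions, widening F8 to F16 is lossless, so a direct conversion should produce the same values as F8->F32->F16 while avoiding the intermediate step.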

Co-authored-by: Majed Sujon <mdmsujon@amd.com>
Co-authored-by: Aditya Joshi <Aditya.Joshi@amd.com>
Co-authored-by: Zhongze Li <ZHONGZE.LI@amd.com>
Co-authored-by: Vin Huang <vin.huang@amd.com>
Co-authored-by: Feroz KamalMustafa <Feroz.KamalMustafa@amd.com>
Co-authored-by: Babak Poursartip <Babak.Poursartip@amd.com>
Co-authored-by: Brian Shi <Brian.Shi@amd.com>
Co-authored-by: YangWen Huang <YANGWEN.HUANG@AMD.COM>
Co-authored-by: Ethan Lin <Yi-Chen.Lin@amd.com>
Co-authored-by: Meng-Zhe Cai <meng-zhe.cai@amd.com>
Undo reduced target list

3 participants