Issues: NVIDIA/cutlass
Issues list
[QST] Terminology question on GMMA::ScaleOut::One
? - Needs Triage
question
Question
#2046
opened Jan 17, 2025 by
haeunlee99
[FEA] Does it supports quantization-matrix-mul?
? - Needs Triage
feature request
New feature or request
#2044
opened Jan 17, 2025 by
bianxuxuxu
[BUG][QST] Hopper Grouped GEMM Fails When Workspace not aligned at 64, but MinWorkspaceAlignment =16
? - Needs Triage
bug
Something isn't working
#2042
opened Jan 16, 2025 by
ankutalev
[BUG] Modifying the block/warptile shapes and the output datatype in the unit test causes the tests to fail.
? - Needs Triage
bug
Something isn't working
#2041
opened Jan 16, 2025 by
xiaonans
[QST] link invalid in efficient_gemm.md
? - Needs Triage
question
Question
#2038
opened Jan 13, 2025 by
unship
[QST]Question about the picture in documentation Efficient GEMM in CUDA
? - Needs Triage
question
Question
#2034
opened Jan 9, 2025 by
sleepwalker2017
[BUG] Logic issue in nondeterministic reduction mode of Stream-K tile scheduler.
? - Needs Triage
bug
Something isn't working
#2027
opened Jan 7, 2025 by
allispaul
[QST] What is API version compatibility?
? - Needs Triage
question
Question
#2025
opened Jan 6, 2025 by
ZzEeKkAa
[QST] why have Int<2>{} in coalesce_x function when last shape value equal to constant one.
? - Needs Triage
question
Question
#2023
opened Jan 5, 2025 by
Shan19900305
[QST] why the implementation of f16xs8 mixed gemm is different between TRT-LLM and native cutlass mixed gemm example?
? - Needs Triage
question
Question
#2022
opened Jan 5, 2025 by
danielhua23
[BUG] Memory corruption/undefined behavior on GemmUniversal in 3.4.0 - 3.6.0 🐛
? - Needs Triage
bug
Something isn't working
#2017
opened Dec 28, 2024 by
warpuv
[QST]Why Does CUTLASS Use 3-4-3 Swizzle?
? - Needs Triage
question
Question
#2015
opened Dec 27, 2024 by
ziyuhuang123
[BUG] [QST] Regression - why Sm90RowBroadcast in 3.5.1 stops support smem usage?
? - Needs Triage
bug
Something isn't working
#2010
opened Dec 23, 2024 by
ankutalev
[BUG] Removal of OpMultiplyAdd template substitutions from mma_sm80.h in 3.5.1
? - Needs Triage
bug
Something isn't working
#2009
opened Dec 23, 2024 by
ankutalev
[QST]How Does TMA Work in CUTLASS for Writing from Shared Memory to Global Memory?
? - Needs Triage
question
Question
#2008
opened Dec 23, 2024 by
ziyuhuang123
[BUG] wmma should be enabled w/ clang.
? - Needs Triage
bug
Something isn't working
#2006
opened Dec 20, 2024 by
Artem-B
[BUG] Unaligned access in test/unit/gemm/threadblock/batched_gemv.cu
? - Needs Triage
bug
Something isn't working
#2003
opened Dec 19, 2024 by
Artem-B
[QST]Behavior of TMA Store and Wait Mechanism in CUTLASS
? - Needs Triage
question
Question
#2002
opened Dec 19, 2024 by
ziyuhuang123
[QST] When to use MainloopSm90TmaGmmaWarpSpecializedFP8?
? - Needs Triage
question
Question
#2001
opened Dec 19, 2024 by
ginowu
[Proposal] layout deduction ambiguity of Nested Layout Access Problem
? - Needs Triage
bug
Something isn't working
#2000
opened Dec 18, 2024 by
yiakwy-xpu-ml-framework-team
[QST]Is the Key Difference Between mbarrier and barrier Their Handling of Producer-Consumer Count?
? - Needs Triage
inactive-30d
question
Question
#1999
opened Dec 18, 2024 by
ziyuhuang123
[QST]How to Handle Synchronization with Different Thread Counts for Producer and Consumer in CUTLASS?
? - Needs Triage
inactive-30d
question
Question
#1998
opened Dec 18, 2024 by
ziyuhuang123
[BUG] calling cast_smem_ptr_to_uint(device fn) from make_gmma_desc(host device fn) is not allowed
? - Needs Triage
bug
Something isn't working
inactive-30d
#1997
opened Dec 18, 2024 by
lygztq
[QST] Gemm got 'incomplete type is not allowed' when use Sm90
? - Needs Triage
inactive-30d
question
Question
#1996
opened Dec 18, 2024 by
TopIdiot
[QST] custom kernel integrated in Pytorch
? - Needs Triage
inactive-30d
question
Question
#1991
opened Dec 16, 2024 by
IzanCatalan