
Conversation

@IMbackK (Collaborator) commented Sep 24, 2025

rocwmma 2.0.0 includes a bug in the code faking fp16 accumulation on CDNA

The current rocwmma, as released with ROCm 7.0.0 and 7.0.1, includes an embarrassing compile-time bug in the code that emulates fp16 accumulation via downcast on devices that do not support it in hardware.

This PR redesigns the conditions under which the WMMA fattn kernel is selected, and avoids compiling and using the kernel in the following broken configurations (a gating sketch follows the list):

CDNA with rocWMMA 2.0.0
RDNA4 with rocWMMA < 2.0.0
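
A hypothetical preprocessor sketch of such a gate, assuming rocwmma-version.hpp exposes ROCWMMA_VERSION_MAJOR/MINOR/PATCH: GGML_HIP_ROCWMMA_FATTN is the existing opt-in define, but ARCH_IS_CDNA and ARCH_IS_RDNA4 are invented stand-ins for whatever the build actually defines, and this is not the PR's real code:

    #if defined(GGML_HIP_ROCWMMA_FATTN)
    #include <rocwmma/internal/rocwmma-version.hpp>

    #define ROCWMMA_IS_2_0_0 (ROCWMMA_VERSION_MAJOR == 2 && \
        ROCWMMA_VERSION_MINOR == 0 && ROCWMMA_VERSION_PATCH == 0)

    // Broken pairings per the description above:
    //   CDNA  + rocWMMA 2.0.0
    //   RDNA4 + rocWMMA < 2.0.0
    #if (defined(ARCH_IS_CDNA) && ROCWMMA_IS_2_0_0) || \
        (defined(ARCH_IS_RDNA4) && ROCWMMA_VERSION_MAJOR < 2)
    #undef GGML_HIP_ROCWMMA_FATTN // fall back to the non-WMMA fattn path
    #endif
    #endif // defined(GGML_HIP_ROCWMMA_FATTN)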

if (NOT ${FOUND_ROCWMMA})
    message(FATAL_ERROR "rocwmma has not been found")
endif()
endif()
@IMbackK (Collaborator, Author) commented on this snippet:

This condition never worked. For CHECK_INCLUDE_FILE_CXX, CMake generates a .cpp file that includes the header, compiles it with the C++ compiler, and checks whether the compilation succeeds. In this case the compilation can never succeed, as rocwmma.hpp uses HIP extensions to C++; therefore FOUND_ROCWMMA was never set.
This was then masked by the condition NOT ${FOUND_ROCWMMA} being wrong: it should be NOT FOUND_ROCWMMA, as NOT ${FOUND_ROCWMMA} expands to NOT "" when FOUND_ROCWMMA is not set, which evaluates to TRUE.
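
A minimal sketch of a check that avoids both pitfalls (the ROCM_PATH variable and the hint paths are assumptions for illustration, not the PR's actual change): probe for the header on disk instead of compiling it, and test the variable by name rather than by expansion:

    # rocwmma.hpp uses HIP extensions to C++, so the plain C++ test compile
    # done by CHECK_INCLUDE_FILE_CXX can never succeed; find_path only
    # checks that the file exists on disk.
    find_path(ROCWMMA_INCLUDE_DIR
        NAMES rocwmma/rocwmma.hpp
        HINTS ${ROCM_PATH}/include /opt/rocm/include)

    # Test the unexpanded variable name; find_path leaves a -NOTFOUND
    # value (falsy) on failure, so this fires exactly when it should.
    if (NOT ROCWMMA_INCLUDE_DIR)
        message(FATAL_ERROR "rocwmma has not been found")
    endif()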

@IMbackK IMbackK changed the title HIP: Disable ROCWMMA fatt on CDNA when compiled against ROCWMMA 2.0.0 HIP: Disable ROCWMMA fattn on CDNA when compiled against ROCWMMA 2.0.0 Sep 24, 2025
@IMbackK (Collaborator, Author) commented Sep 24, 2025

fixes #16153

@IMbackK (Collaborator, Author) commented Sep 24, 2025

Unfortunately we can't just accumulate at fp32 in the WMMA kernel on CDNA to avoid this bug, even though that would be more performant, as we don't have enough shared memory for it.
We could explore opportunistically doing so for the shapes where shared memory suffices; a rough illustration of the constraint follows.
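
Back-of-the-envelope, with purely hypothetical tile sizes (the real kernel's tiles are shape-dependent): widening the accumulator from fp16 to fp32 doubles its footprint, against the 64 KiB of LDS a CDNA workgroup can use.

    #include <cstddef>
    #include <cstdio>

    int main() {
        // Hypothetical accumulator tile; illustration only.
        const std::size_t elems = 64 * 64 * 4; // rows * cols * column tiles
        std::printf("fp16 accumulators: %zu KiB\n", elems * 2 / 1024); // 32 KiB
        std::printf("fp32 accumulators: %zu KiB\n", elems * 4 / 1024); // 64 KiB
        // At fp32 the accumulators alone would consume the entire 64 KiB
        // LDS budget of a CDNA workgroup, leaving nothing for the K/Q/V
        // tiles the kernel also stages in shared memory.
        return 0;
    }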

@github-actions bot added the labels Nvidia GPU (Issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) on Sep 24, 2025
@IMbackK (Collaborator, Author) commented Sep 24, 2025

Currently this can't build on CI, as the rocwmma installation on CI is incorrect.
At the moment we simply clone the rocwmma repo:

git clone https://github.com/rocm/rocwmma --branch rocm-${{ env.ROCM_VERSION }} --depth 1

and then use rocwmma, which is header-only, from there.
Since this way we don't run rocwmma's CMake build system, rocwmma-version.hpp never gets generated from https://github.com/ROCm/rocWMMA/blob/develop/library/include/rocwmma/internal/rocwmma-version.hpp.in, and thus this header is missing. One way to install it properly is sketched below.
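
A sketch of a fix (not the follow-up PR; the ROCWMMA_BUILD_* option names are assumptions to verify against rocWMMA's CMakeLists): run rocWMMA's CMake configure step, which is what generates rocwmma-version.hpp from the template, then install the headers:

    git clone https://github.com/ROCm/rocWMMA --branch rocm-${{ env.ROCM_VERSION }} --depth 1
    # Configuring runs configure_file() on rocwmma-version.hpp.in.
    cmake -S rocWMMA -B rocwmma-build \
        -DCMAKE_BUILD_TYPE=Release \
        -DROCWMMA_BUILD_TESTS=OFF \
        -DROCWMMA_BUILD_SAMPLES=OFF
    # rocwmma is header-only, so installing just copies the headers,
    # including the generated version header, into the prefix.
    cmake --install rocwmma-build --prefix /opt/rocm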

@JohannesGaessler (Collaborator) left a comment:

FYI one of my long-term goals is to remove the WMMA kernel by expanding support for the mma kernel. The instructions that are still missing support are Volta tensor cores, AMD WMMA, and AMD MFMA. I'll need to think about how to organize my hardware; I'll definitely procure an RDNA GPU. For V100/Mi100 I'm not yet sure how to best obtain access.

@IMbackK (Collaborator, Author) commented Sep 24, 2025

Do not merge this until CI is fixed by properly installing rocwmma there (a PR for this will follow).

@IMbackK (Collaborator, Author) commented Sep 24, 2025

> FYI one of my long-term goals is to remove the WMMA kernel by expanding support for the mma kernel. The instructions that are still missing support are Volta tensor cores, AMD WMMA, and AMD MFMA. I'll need to think about how to organize my hardware; I'll definitely procure an RDNA GPU. For V100/Mi100 I'm not yet sure how to best obtain access.

@deepsek You have previously expressed interest in adding MFMA support to the fattn mma path; it would be helpful if you could share your current plans in this direction, if any.

@deepsek (Contributor) commented Sep 24, 2025

@IMbackK, I was targeting a November PR to address fattn for MMA along with some other changes. But I'm currently stretched thin with other open-source projects. We might be delayed until 2026. If anyone in the community is taking up this effort, I would be happy to assist with issues!

@IMbackK (Collaborator, Author) commented Oct 1, 2025

@slaren It seems @JohannesGaessler's approval is no longer sufficient, I believe due to the recent changes to CODEOWNERS, or perhaps some other configuration change I'm not aware of.

@slaren (Member) commented Oct 1, 2025

The number of people with write access has been reduced, see #16113 for more details. Merging based on Johannes' approval.

@slaren merged commit e95fec6 into ggml-org:master on Oct 1, 2025; 60 of 68 checks passed.
Nexesenex added commits referencing this pull request to Nexesenex/croco.cpp between Oct 2 and Oct 9, 2025.