relent95 (Contributor) commented Sep 16, 2025

This PR adds the conv_transpose_2d operation to the Vulkan backend. The code is based on the implementation of the existing conv_2d operation. The shader supports strides (s0, s1), paddings (p0, p1) and dilations (d0, d1), but ggml_vk_conv_transpose_2d() constrains them to s1 = s0, p0 = p1 = 0 and d0 = d1 = 1 because of the existing GGML_OP_CONV_TRANSPOSE_2D interface.
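For readers unfamiliar with the operation, here is a minimal single-channel CPU sketch of what a transposed 2D convolution with strides, paddings and dilations computes. It is not the PR's shader or the ggml API: the function names are hypothetical, row-major layouts are assumed, and index 0 (s0, p0, d0) is assumed to refer to the width axis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the ggml API): output extent along one axis for
// input size `in`, kernel size `k`, stride `s`, padding `p`, dilation `d`.
int64_t conv_transpose_out_size(int64_t in, int64_t k, int64_t s, int64_t p, int64_t d) {
    assert(in > 0 && k > 0 && s > 0 && p >= 0 && d > 0);
    return (in - 1) * s - 2 * p + d * (k - 1) + 1;
}

// Single-channel reference in "scatter" form: every input element adds a
// scaled copy of the kernel into the output at a stride-dependent offset.
std::vector<float> conv_transpose_2d_ref(
        const std::vector<float> & in,  int64_t IH, int64_t IW,
        const std::vector<float> & ker, int64_t KH, int64_t KW,
        int64_t s0, int64_t s1, int64_t p0, int64_t p1, int64_t d0, int64_t d1) {
    const int64_t OW = conv_transpose_out_size(IW, KW, s0, p0, d0);
    const int64_t OH = conv_transpose_out_size(IH, KH, s1, p1, d1);
    assert(OW > 0 && OH > 0);
    std::vector<float> out(static_cast<size_t>(OH * OW), 0.0f);
    for (int64_t iy = 0; iy < IH; ++iy) {
        for (int64_t ix = 0; ix < IW; ++ix) {
            for (int64_t ky = 0; ky < KH; ++ky) {
                for (int64_t kx = 0; kx < KW; ++kx) {
                    const int64_t oy = iy * s1 + ky * d1 - p1;
                    const int64_t ox = ix * s0 + kx * d0 - p0;
                    // Padding crops the border, so some taps fall outside.
                    if (oy < 0 || oy >= OH || ox < 0 || ox >= OW) {
                        continue;
                    }
                    out[oy * OW + ox] += in[iy * IW + ix] * ker[ky * KW + kx];
                }
            }
        }
    }
    return out;
}
```

Under the constraints stated above (p = 0, d = 1, a single stride s), the output extent reduces to (in - 1) * s + k along each axis.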

@relent95 relent95 requested a review from 0cc4m as a code owner September 16, 2025 07:29
@github-actions github-actions bot added the testing, Vulkan and ggml labels Sep 16, 2025
0cc4m (Collaborator) commented Sep 16, 2025

If this is based on the existing conv_2d shader, shouldn't it be possible to use the existing .comp file and only change the parts that are actually different with a preprocessor variable?
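As a rough illustration of this suggestion, the sketch below keeps one source file and switches only the index mapping with a preprocessor flag. It is plain C++ rather than GLSL, and the flag name `TRANSPOSE` and the function `input_coord` are hypothetical, not taken from the PR. The transposed branch uses the gather form, where each output element looks up the input element that contributes through a given kernel tap.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical build flag: compiling the same file with -DTRANSPOSE would
// produce the transposed variant; without it, the regular convolution.
//
// Maps output coordinate `o` and kernel tap `k` to the input coordinate `i`
// read by that tap. Returns false when the tap touches no input element.
bool input_coord(int64_t o, int64_t k, int64_t s, int64_t p, int64_t d,
                 int64_t in_size, int64_t & i) {
    assert(s > 0 && d > 0);
#ifdef TRANSPOSE
    // Transposed convolution: o = i*s + k*d - p, solved for i.
    const int64_t num = o + p - k * d;
    if (num < 0 || num % s != 0) {
        return false;  // before the input, or between two stride steps
    }
    i = num / s;
#else
    // Regular convolution: i = o*s + k*d - p.
    i = o * s + k * d - p;
    if (i < 0) {
        return false;  // tap lands in the left/top padding
    }
#endif
    return i < in_size;
}
```

Everything around this function (loops, accumulation, bounds on the output) can stay shared between the two variants, which is the maintenance benefit being asked for.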

@relent95 relent95 requested a review from jeffbolznv September 17, 2025 04:33
etasnadi (Contributor) commented Sep 18, 2025

> If this is based on the existing conv_2d shader, shouldn't it be possible to use the existing .comp file and only change the parts that are actually different with a preprocessor variable?

You are right: if each conv variant (transpose, gradient, channel-wise) gets its own shader with repeated code, maintenance will be really tough.

Also, there are still many ways left to optimize the conv kernel and I also have a few updates in my private repo that I might polish in the near future and submit.

0cc4m (Collaborator) commented Sep 18, 2025

@etasnadi Yeah, the PR has already been updated to do that. You could also do a review if you want, you know the shader better than me. I'll mostly make sure that the C++ code is fine and the shader passes on Intel, AMD and Nvidia.

etasnadi (Contributor) commented Sep 18, 2025

> @etasnadi Yeah, the PR has already been updated to do that. You could also do a review if you want, you know the shader better than me. I'll mostly make sure that the C++ code is fine and the shader passes on Intel, AMD and Nvidia.

Yes, I've just realized that it was updated since.

Now my problem is that the kernel is and will be too complicated (it was really complicated before too), so we might need to introduce some abstraction I guess. Maybe @jeffbolznv has some ideas.

What do you think about adding support for HLSL shaders? As far as I know, glslangValidator already has basic support, but in the meantime we could use dxc.

0cc4m (Collaborator) commented Sep 18, 2025

> What do you think about adding support for HLSL shaders? As far as I know, glslangValidator already has basic support, but in the meantime we could use dxc.

What's the advantage of HLSL over GLSL? I'm not familiar with it, and not a fan of Microsoft dependencies. It would probably make maintenance harder. If you want to look into a shader language with more modern features, wouldn't slang be more interesting and more open?

Personally I'm hoping one of the projects looking into a C++-based compute shader syntax (similar to CUDA and ROCm) pans out. For now GLSL is good enough for me.

etasnadi (Contributor) commented Sep 18, 2025

> > What do you think about adding support for HLSL shaders? As far as I know, glslangValidator already has basic support, but in the meantime we could use dxc.
>
> What's the advantage of HLSL over GLSL? I'm not familiar with it, and not a fan of Microsoft dependencies. It would probably make maintenance harder. If you want to look into a shader language with more modern features, wouldn't slang be more interesting and more open?
>
> Personally I'm hoping one of the projects looking into a C++-based compute shader syntax (similar to CUDA and ROCm) pans out. For now GLSL is good enough for me.

For example, it seems to support templates: https://devblogs.microsoft.com/directx/announcing-hlsl-2021/#template-functions-and-data-types - and templates alone would help a lot. I am not a fan of adding dependencies on projects governed by a single company either, but this kernel will be unmaintainable in the future at this abstraction level, and GLSL has limited features to deal with the problem.

Slang is also a good idea; however, I do not know how widely it is adopted or whether it is mature enough. For example, I tried to use coopmats with slang without success ~1 year ago - I guess their compiler does not support all extensions automatically?
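To make the point about templates concrete, here is a small C++ sketch (C++ templates standing in for HLSL 2021 templates or Slang generics, which are not shown). The policy types `ConvIndex` and `ConvTransposeIndex` and the function `conv_1d` are hypothetical; the sketch is 1D with no padding or dilation to keep it short. One loop body serves both variants, and the variant-specific index math lives in the policy type instead of in `#ifdef` blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical policy: regular (strided) convolution.
struct ConvIndex {
    static int64_t out_size(int64_t in, int64_t k, int64_t s) { return (in - k) / s + 1; }
    static bool map(int64_t o, int64_t k, int64_t s, int64_t & i) {
        i = o * s + k;
        return true;
    }
};

// Hypothetical policy: transposed convolution in gather form.
struct ConvTransposeIndex {
    static int64_t out_size(int64_t in, int64_t k, int64_t s) { return (in - 1) * s + k; }
    static bool map(int64_t o, int64_t k, int64_t s, int64_t & i) {
        const int64_t num = o - k;
        if (num < 0 || num % s != 0) {
            return false;  // this tap does not line up with any input element
        }
        i = num / s;
        return true;
    }
};

// Shared loop; only the index policy differs between the two operations.
template <typename Index>
std::vector<float> conv_1d(const std::vector<float> & in, const std::vector<float> & ker, int64_t s) {
    const int64_t IN = static_cast<int64_t>(in.size());
    const int64_t K  = static_cast<int64_t>(ker.size());
    assert(s > 0 && IN > 0 && K > 0);
    const int64_t OUT = Index::out_size(IN, K, s);
    assert(OUT > 0);
    std::vector<float> out(static_cast<size_t>(OUT), 0.0f);
    for (int64_t o = 0; o < OUT; ++o) {
        for (int64_t k = 0; k < K; ++k) {
            int64_t i = 0;
            if (Index::map(o, k, s, i) && i < IN) {
                out[o] += in[i] * ker[k];
            }
        }
    }
    return out;
}
```

Calling `conv_1d<ConvIndex>(...)` or `conv_1d<ConvTransposeIndex>(...)` selects the variant at compile time, so adding a further variant means adding a policy type rather than another copy of the kernel.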

etasnadi (Contributor) commented Sep 18, 2025

> > What do you think about adding support for HLSL shaders? As far as I know, glslangValidator already has basic support, but in the meantime we could use dxc.
>
> What's the advantage of HLSL over GLSL? I'm not familiar with it, and not a fan of Microsoft dependencies. It would probably make maintenance harder. If you want to look into a shader language with more modern features, wouldn't slang be more interesting and more open?
>
> Personally I'm hoping one of the projects looking into a C++-based compute shader syntax (similar to CUDA and ROCm) pans out. For now GLSL is good enough for me.

Now I see that they don't have coopmat support yet, but it's WIP (shader-slang/slang#7634). So I believe it is a good idea to add support for slang in the near future! Also, there are several NV-suffixed accounts contributing to the project, so I guess Nvidia is betting on it.

0cc4m (Collaborator) commented Sep 18, 2025

@etasnadi I think it's Khronos, not Nvidia specifically, but yeah. Coopmat should already be there, see shader-slang/slang#7170 (comment), but let's not sidetrack this PR. If you want to look into it, go ahead. If discussion is needed, please open an issue about it.

jeffbolznv (Collaborator) commented

HLSL doesn't support spec constants which IMO is a deal breaker. It also only has coopmat1 level of support for use in Vulkan. slang supports coopmat2, spec constants, and generics, and there are cases where generics would be helpful.

0cc4m (Collaborator) left a comment

The code runs correctly on my hardware. Looks good, we just need to resolve the last few comments.

@relent95 relent95 requested a review from 0cc4m September 21, 2025 12:56
0cc4m (Collaborator) left a comment

LGTM

@0cc4m 0cc4m merged commit 96fdca0 into ggml-org:master Sep 22, 2025
49 of 53 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Sep 23, 2025
* origin/master: (39 commits)
ci : disable AMD workflows + update NVIDIA workflows (ggml-org#16200)
ci : enable Vulkan workflow on Mac (ggml-org#16194)
ggml-cpu: Respect cpumask settings (ggml-org#16164)
ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (ggml-org#15928)
zdnn: refactor codebase + add docs (ggml-org#16178)
codeowners : add @danbev to model-conversion example [no ci] (ggml-org#16190)
devops: add s390x containers (ggml-org#15915)
ggml-cpu : fix typo in gemm comments [no ci] (ggml-org#16189)
feat: Add conversion support in GraniteHybrid for non-hybrid (all attn) (ggml-org#16177)
clang-tidy : disable warning about performance enum size (ggml-org#16127)
ggml : implement set_rows with i32 index (ggml-org#16159)
codeowners : update + cleanup (ggml-org#16174)
common : enable `--offline` mode without curl support (ggml-org#16137)
webui : fix handling incomplete chunks (ggml-org#16107)
embedding : fix typos in README (ggml-org#16171)
common : remove unused local variables (ggml-org#16140)
ggml : extend ggml_can_fuse to work with non-sequential nodes (ggml-org#16123)
ggml : add ggml_op_is_empty (ggml-org#16122)
codeowners : update ownership for @ngxson and @allozuar (ggml-org#16128)
Vulkan: add conv_transpose_2d operation (ggml-org#16022)
...
struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
* Vulkan: add conv_transpose_2d operation

* Vulkan: fix typo in conv_transpose_2d shader(s0mp, s0L, s1mp, s1L)

* Vulkan: fix incorrect indentation in conv_transpose_2d shader

* Vulkan: add checking the push constants size limit and reuse conv2d_mm.comp for conv_transpose_2d operation

* Vulkan: revert the order of the index calculation and bound check in conv_2d shader

* Vulkan: explicitly check push constants limit in supports_op() for conv_transpose_2d operation.

* Vulkan: remove unnecessary lower bound checks for H/W_idx in the conv_2d shader.
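
The push-constants check mentioned in the commit list above can be pictured roughly as follows. This is a sketch under assumptions, not the PR's code: the struct and function names are hypothetical, and the device limit is passed in as a plain integer standing in for Vulkan's `VkPhysicalDeviceLimits::maxPushConstantsSize`, which the Vulkan specification guarantees to be at least 128 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical push-constant block with 36 32-bit fields (144 bytes).
// The real layout used by the conv_transpose_2d shader may differ.
struct example_push_constants {
    uint32_t fields[36];
};
static_assert(sizeof(example_push_constants) == 144, "unexpected padding");

// Returns true when the block fits within the device's push-constant limit.
// A supports_op()-style query could return false otherwise, so the op is
// rejected for this backend up front rather than failing at dispatch time.
template <typename PushConstants>
bool push_constants_fit(uint32_t max_push_constants_size) {
    // Precondition: the Vulkan spec guarantees a limit of at least 128 bytes.
    assert(max_push_constants_size >= 128);
    return sizeof(PushConstants) <= max_push_constants_size;
}
```

With this example block, a device that reports only the guaranteed minimum of 128 bytes would not support the op, while a device reporting 256 bytes would.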