
Conversation

ggerganov (Member)

In some cases, we might want to fuse ops that are not sequential in the graph. For example, when there are intermediate view ops in-between the fusable ops.

Sample usage: #16102
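
For example, two fusable ADDs separated by a VIEW are no longer adjacent in cgraph->nodes, so the check has to take explicit node indices. A minimal sketch of the intended usage (i_add0 and i_add1 are hypothetical indices of the two ADD nodes; the exact signatures are shown in the review below):

    const int          idxs[2] = { i_add0, i_add1 };           // hypothetical ADD node indices
    const enum ggml_op ops [2] = { GGML_OP_ADD, GGML_OP_ADD };

    if (ggml_can_fuse_ext(cgraph, idxs, ops, 2)) {
        // safe to launch a single fused kernel for the two ADDs
    }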

github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Sep 20, 2025
jeffbolznv (Collaborator) left a comment:


Fwiw, with the sorting I have in the vulkan backend, I try to separate empty nodes from non-empty nodes, which makes it less important to be able to skip them.

    static inline bool ggml_can_fuse_ext(const struct ggml_cgraph * cgraph, const int * node_idxs, const enum ggml_op * ops, int num_ops) {
        for (int i = 0; i < num_ops; ++i) {
            struct ggml_tensor * node = cgraph->nodes[node_idxs[i]];

            if (node_idxs[i] + num_ops > cgraph->n_nodes) {
jeffbolznv (Collaborator) commented:

Why add num_ops here? Also, it should be >=, I think.
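
That is, something like the following (a sketch of the suggested condition only; the fix that was eventually merged may differ):

    // each fused node index must itself be in range:
    if (node_idxs[i] >= cgraph->n_nodes) {
        return false;
    }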

        return ggml_can_fuse(cgraph, node_idx, ops.begin(), (int)ops.size());
    }

    inline bool ggml_can_fuse(const struct ggml_cgraph * cgraph, std::initializer_list<int> node_idx, std::initializer_list<enum ggml_op> ops) {
jeffbolznv (Collaborator) commented:

I'm not totally clear on how you foresee this function being used. I assume the caller needs to look through the nodes and find the ops it's interested in (skipping empty nodes), so I don't think it'll be able to use an initializer list for the node idxs.

ggerganov (Member, Author) replied:

The idea is to create a list of indices, idxs, that excludes the empty nodes, and then use it like so:

    // non-empty node indices
    std::vector<int> idxs;

    bool can_fuse(int i0, int i1, std::initializer_list<enum ggml_op> ops) const {
        assert(use_fusion);
        assert(i0 >= 0 && i0 < n_nodes());
        assert(i1 >= 0 && i1 < n_nodes());
        assert(ops.size() == 2);

        return ggml_can_fuse(gf, { idxs[i0], idxs[i1] }, ops);
    }

Btw, I really only need an API to tell if 2 ops are fusable - I don't think there is any benefit to supporting more than a pair of ops. The assumption is that if we can fuse N ops, then for sure we can fuse N-1 ops. So at step N we only need to check whether ops N and N+1 are fusable - no need to iterate again over all previous ops.
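
For reference, a minimal sketch of how such an index list could be built, assuming the ggml_op_is_empty helper added in ggml-org#16122 (not the actual llama.cpp code):

    std::vector<int> idxs;
    idxs.reserve(gf->n_nodes);

    for (int i = 0; i < gf->n_nodes; ++i) {
        // keep only nodes that do real work; views, permutes, etc. are skipped
        if (!ggml_op_is_empty(gf->nodes[i]->op)) {
            idxs.push_back(i);
        }
    }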

jeffbolznv (Collaborator) replied:

If you have to rebuild the initializer_list from the vector, why not just pass an int* instead?

ggerganov (Member, Author) replied:

Right, I can do that. Removed the extra overload.
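
With the overload gone, the earlier can_fuse helper reduces to something like this (a sketch against the ggml_can_fuse_ext signature above; the merged code may differ):

    bool can_fuse(int i0, int i1, std::initializer_list<enum ggml_op> ops) const {
        // translate positions in the non-empty list back to graph node indices
        const int node_idxs[2] = { idxs[i0], idxs[i1] };

        return ggml_can_fuse_ext(gf, node_idxs, ops.begin(), (int)ops.size());
    }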

ggerganov (Member, Author) commented Sep 20, 2025:

> Fwiw, with the sorting I have in the vulkan backend, I try to separate empty nodes from non-empty nodes, which makes it less important to be able to skip them.

Yes, I noticed that this works, but I'm not confident that it won't break some assumption in the future. Specifically, this case:

    c0 = ADD(a, b)
    f0 = VIEW(...)
    c1 = ADD(c0, f0)
    f1 = VIEW(...)
    c2 = ADD(c1, f1)
    ... etc

If I just move all the ADDs together and put the VIEWs after them, everything seems to work. But I think I also saw that, depending on the order of the views, the allocator (allocr) can do different things. For example, at some point I was thinking: let graph_optimize simply put all empty nodes at the end of the graph first, and then work with a truncated graph with fewer nodes. This also runs correctly, but I did notice a different allocation pattern for the nodes. Hence I have some doubts about the implications of changing the order of the views (or of any other empty node).
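
For concreteness, the "put all empty nodes at the end" experiment described above amounts to a stable partition of the node array (a sketch, assuming ggml_op_is_empty from ggml-org#16122 and #include <algorithm>; not the actual graph_optimize code):

    // move empty nodes (views, permutes, ...) after the non-empty ones,
    // preserving the relative order within each group:
    std::stable_partition(gf->nodes, gf->nodes + gf->n_nodes,
        [](const struct ggml_tensor * node) {
            return !ggml_op_is_empty(node->op);
        });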

@ggerganov ggerganov merged commit 4f324a5 into master Sep 22, 2025
60 of 68 checks passed
@ggerganov ggerganov deleted the gg/ggml-fuse-overload branch September 22, 2025 08:12
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Sep 23, 2025
* origin/master: (39 commits)
ci : disable AMD workflows + update NVIDIA workflows (ggml-org#16200)
ci : enable Vulkan workflow on Mac (ggml-org#16194)
ggml-cpu: Respect cpumask settings (ggml-org#16164)
ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (ggml-org#15928)
zdnn: refactor codebase + add docs (ggml-org#16178)
codeowners : add @danbev to model-conversion example [no ci] (ggml-org#16190)
devops: add s390x containers (ggml-org#15915)
ggml-cpu : fix typo in gemm comments [no ci] (ggml-org#16189)
feat: Add conversion support in GraniteHybrid for non-hybrid (all attn) (ggml-org#16177)
clang-tidy : disable warning about performance enum size (ggml-org#16127)
ggml : implement set_rows with i32 index (ggml-org#16159)
codeowners : update + cleanup (ggml-org#16174)
common : enable `--offline` mode without curl support (ggml-org#16137)
webui : fix handling incomplete chunks (ggml-org#16107)
embedding : fix typos in README (ggml-org#16171)
common : remove unused local variables (ggml-org#16140)
ggml : extend ggml_can_fuse to work with non-sequential nodes (ggml-org#16123)
ggml : add ggml_op_is_empty (ggml-org#16122)
codeowners : update ownership for @ngxson and @allozuar (ggml-org#16128)
Vulkan: add conv_transpose_2d operation (ggml-org#16022)
...
struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
…rg#16123)

* ggml : extend ggml_can_fuse to work with non-sequential nodes in the graph

* cont : fix wrong bounds check condition

* cont : remove unnecessary overload