
Commit 5ef321d

increase graph_max_nodes for finetune
fix regression during finetune on Llama-3.2-1B-F32:

    GGML_ASSERT(cgraph->n_nodes < cgraph->size) failed

git bisect (applying the most recent finetune (SGD) change at each step) showed that

    d498af3 Georgi Gerganov 2025-07-18 14:31:15 +0300
    graph : avoid huge warm-up graphs for MoE models (ggml-org#14753)

which greatly decreased graph_max_nodes, has been responsible for finetune failing on reasonably sized models for the past two months. this partially reverts the decrease (larger models may still fail).

note: env LLAMA_SET_ROWS=0 is also needed, or else

    GGML_ASSERT(!node->view_src || node->op == GGML_OP_CPY || node->op == GGML_OP_VIEW || node->op == GGML_OP_RESHAPE || node->op == GGML_OP_PERMUTE || node->op == GGML_OP_TRANSPOSE) failed

(the node->op in question is indeed a rows op). unfortunately a git revert of

    8a4280c Georgi Gerganov 2025-08-28 12:27:02 +0300
    kv-cache : remove LLAMA_SET_ROWS checks (ggml-org#15505)

is not straightforward, so this branch stays behind that commit.
1 parent: f97e4e7

File tree

1 file changed: +1 −1


src/llama-context.cpp

Lines changed: 1 addition & 1 deletion
@@ -1338,7 +1338,7 @@ void llama_context::output_reorder() {
 //
 
 uint32_t llama_context::graph_max_nodes() const {
-    return std::max<uint32_t>(1024u, 8u*model.n_tensors());
+    return std::max<uint32_t>(4096u, 8u*model.n_tensors());
 }
 
 llm_graph_result * llama_context::get_gf_res_reserve() const {
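
The change only raises the floor of the reserved graph size. As a rough illustration of why that matters, below is a minimal standalone sketch (not llama.cpp code) of the arithmetic involved; the tensor and node counts are hypothetical, and the real finetune graph size depends on the model and on the forward/backward/optimizer nodes ggml adds.

```cpp
// Minimal standalone sketch (not llama.cpp code): shows how the floor in
// graph_max_nodes() bounds the compute-graph capacity checked by the failing
// GGML_ASSERT(cgraph->n_nodes < cgraph->size). All counts below are
// hypothetical and chosen only to illustrate why a 1024 floor was too small.
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>

// mirrors llama_context::graph_max_nodes() after this commit
static uint32_t graph_max_nodes(uint32_t n_tensors) {
    return std::max<uint32_t>(4096u, 8u*n_tensors);
}

int main() {
    // hypothetical small model: 8*n_tensors is below the floor, so the floor
    // is what actually determines the reserved graph size
    const uint32_t n_tensors = 300;                        // assumption
    const uint32_t size      = graph_max_nodes(n_tensors); // 4096 here

    // a finetune graph holds forward, backward and optimizer nodes, so its
    // node count can exceed the old 1024 floor even for a modest model
    const uint32_t n_nodes = 2500;                         // assumption

    // with the old floor (1024) this condition would not hold, which is the
    // assertion failure reported in the commit message
    assert(n_nodes < size);
    printf("graph_max_nodes = %u, n_nodes = %u\n", size, n_nodes);
    return 0;
}
```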
