model-conversion : add --embeddings flag to modelcard.template [no ci] #15801

danbev · 2025-09-04T16:32:02Z

This commit updates the modelcard.template file used in the model conversion scripts for embedding models to include the llama-server --embeddings flag in the recommended command to run the model.

The motivation for this change was that when using the model-conversion "tool" to upload the EmbeddingGemma models to Hugging Face this flag was missing and the embedding endpoint was there for not available when copy&pasting the command.

This commit updates the modelcard.template file used in the model conversion scripts for embedding models to include the llama-server --embeddings flag in the recommended command to run the model. The motivation for this change was that when using the model-conversion "tool" to upload the EmbeddingGemma models to Hugging Face this flag was missing and the embedding endpoint was there for not available when copy&pasting the command.

…g-model-disabled-agent-prefill * origin/master: (84 commits) CUDA: fastdiv, launch bounds for mmvq + q8_1 quant (ggml-org#15802) tests : add --list-ops and --show-coverage options (ggml-org#15745) gguf: gguf_writer refactor (ggml-org#15691) kv-cache : fix SWA checks + disable cacheless iSWA (ggml-org#15811) model-conversion : add --embeddings flag to modelcard.template [no ci] (ggml-org#15801) chat : fixed crash when Hermes 2 <tool_call> had a newline before it (ggml-org#15639) chat : nemotron thinking & toolcalling support (ggml-org#15676) scripts : add Jinja tester PySide6 simple app (ggml-org#15756) llama : add support for EmbeddingGemma 300m (ggml-org#15798) metal : Add template specialization for mul_mm_id w/ ne20 == 10 (ggml-org#15799) llama : set n_outputs to 1 to avoid 0 outputs mean-pooling (ggml-org#15791) CANN: Refactor ND to NZ workspace to be per-device (ggml-org#15763) server: add exceed_context_size_error type (ggml-org#15780) Document the new max GPU layers default in help (ggml-org#15771) ggml: add ops for WAN video model (cuda && cpu) (ggml-org#15669) CANN: Fix precision issue on 310I DUO multi-devices (ggml-org#15784) opencl: add hs=40 to FA (ggml-org#15758) CANN: fix acl_rstd allocation size in ggml_cann_rms_norm (ggml-org#15760) vulkan: fix mmv subgroup16 selection (ggml-org#15775) vulkan: don't use std::string in load_shaders, to improve compile time (ggml-org#15724) ...

…upport * origin/master: Thinking model disabled assistant prefill (ggml-org#15404) Implement --log-colors with always/never/auto (ggml-org#15792) CUDA: fastdiv, launch bounds for mmvq + q8_1 quant (ggml-org#15802) tests : add --list-ops and --show-coverage options (ggml-org#15745) gguf: gguf_writer refactor (ggml-org#15691) kv-cache : fix SWA checks + disable cacheless iSWA (ggml-org#15811) model-conversion : add --embeddings flag to modelcard.template [no ci] (ggml-org#15801) chat : fixed crash when Hermes 2 <tool_call> had a newline before it (ggml-org#15639) chat : nemotron thinking & toolcalling support (ggml-org#15676) scripts : add Jinja tester PySide6 simple app (ggml-org#15756) llama : add support for EmbeddingGemma 300m (ggml-org#15798)

ggml-org#15801) This commit updates the modelcard.template file used in the model conversion scripts for embedding models to include the llama-server --embeddings flag in the recommended command to run the model. The motivation for this change was that when using the model-conversion "tool" to upload the EmbeddingGemma models to Hugging Face this flag was missing and the embedding endpoint was there for not available when copy&pasting the command.

ggerganov approved these changes Sep 4, 2025

View reviewed changes

github-actions bot added the examples label Sep 4, 2025

danbev merged commit 5d6688d into ggml-org:master Sep 5, 2025
1 check passed

danbev deleted the model-conversion-embedding-modelcard branch September 6, 2025 04:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

model-conversion : add --embeddings flag to modelcard.template [no ci] #15801

model-conversion : add --embeddings flag to modelcard.template [no ci] #15801

Uh oh!

danbev commented Sep 4, 2025

Uh oh!

Uh oh!

Uh oh!

model-conversion : add --embeddings flag to modelcard.template [no ci] #15801

model-conversion : add --embeddings flag to modelcard.template [no ci] #15801

Uh oh!

Conversation

danbev commented Sep 4, 2025

Uh oh!

Uh oh!

Uh oh!