GGUF conversion doesn't respect tokenizer config add_bos/eos_token setting #3966

@KerfuffleV2

This causes problems with at least one model (Yi); see the discussion in 01-ai/Yi#5

The automatic BOS that gets prepended apparently confuses the model.

SpecialVocab in gguf.py already loads tokenizer_config.json (although currently only as a fallback). The main open question is probably how to store the setting in the GGUF file: which key to use, etc.
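
To make the idea concrete, here is a rough sketch of reading the two settings from tokenizer_config.json and mapping them to metadata keys. It is standalone and does not use the real SpecialVocab class. The helper name `load_add_token_flags` and the key format `tokenizer.ggml.add_{bos,eos}_token` are assumptions for illustration only, since which key to use is exactly what's undecided here.

```python
import json
from pathlib import Path

# Hypothetical key naming; the actual GGUF key is the open question in this issue.
HYPOTHETICAL_KEY_FMT = "tokenizer.ggml.add_{}_token"


def load_add_token_flags(model_dir):
    """Return {hypothetical_gguf_key: bool} for add_bos_token / add_eos_token.

    Settings that are missing or not plain booleans are left out, so a
    loader could keep its current default behaviour for them.
    """
    config_path = Path(model_dir) / "tokenizer_config.json"
    if not config_path.is_file():
        return {}

    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)

    flags = {}
    for name in ("bos", "eos"):
        value = config.get(f"add_{name}_token")
        # Only record explicit true/false values from the config.
        if isinstance(value, bool):
            flags[HYPOTHETICAL_KEY_FMT.format(name)] = value
    return flags
```

The conversion script could then write each returned pair as a boolean metadata field, and the loader would only prepend BOS automatically when the flag is absent or true.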
