Open
Labels
enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Needs help from the community), split (GGUF split model sharding)
Description
Context
At the moment it is only possible to split a model after conversion or quantization. As mentioned by @Artefact2 in this [comment](https://github.com/ggerganov/llama.cpp/pull/6135#issuecomment-2003942162):
> as an alternative, add the splitting logic directly to tools that produce ggufs, like convert.py and quantize.
Proposition
Include split options in convert*.py, support splits in quantize
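To make the proposal concrete, here is a minimal sketch of how a conversion script could decide which tensors go into which shard before writing anything. It is not the existing convert*.py or gguf-py code: `plan_shards`, `shard_filename`, and the `max_bytes` limit are hypothetical names chosen for illustration, and the `-00001-of-00003.gguf` file name pattern is assumed to match what the existing split tool produces.

```python
from typing import List, Tuple


def plan_shards(tensors: List[Tuple[str, int]], max_bytes: int) -> List[List[str]]:
    """Greedily group tensors into shards of at most max_bytes each.

    `tensors` is a list of (name, size_in_bytes) pairs in output order.
    A tensor larger than max_bytes gets a shard of its own.
    (Hypothetical helper, not part of gguf-py.)
    """
    shards: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in tensors:
        # Start a new shard when adding this tensor would exceed the limit.
        if current and current_bytes + size > max_bytes:
            shards.append(current)
            current = []
            current_bytes = 0
        current.append(name)
        current_bytes += size
    if current:
        shards.append(current)
    return shards


def shard_filename(prefix: str, index: int, total: int) -> str:
    """Build a shard file name; `index` is 1-based (assumed naming pattern)."""
    return f"{prefix}-{index:05d}-of-{total:05d}.gguf"


if __name__ == "__main__":
    # Illustrative tensor names and sizes only.
    plan = plan_shards([("a", 4), ("b", 4), ("c", 4)], max_bytes=8)
    for i, names in enumerate(plan, start=1):
        print(shard_filename("model", i, len(plan)), names)
```

Planning the shards up front keeps the total shard count known before the first file is written, which the `-of-NNNNN` part of the name needs. A limit on the number of tensors per shard could be handled the same way by counting entries instead of bytes.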