Closed
Description
In llama.cpp we have logic for supporting some very old model formats and features such as sharded models, which makes the code unnecessarily complicated and difficult to maintain. We should simplify it and remove support for old formats and features that are no longer used.
Additionally, with the upcoming unified file format (ggml-org/ggml#220) we will have to look into reimplementing the loading code to use it and adding support for loading non-LLaMA models as well. This will be an important step towards adding inference of new models such as MPT and Falcon. Therefore, simplifying the logic as much as possible now will make it easier to adopt the new unified file format when it is ready.