STILL no way to convert phi-3-small to GGUF #8241
It uses a different model architecture: Triton block sparse attention. Supporting it would need a lot of effort. Is the work worth doing? I don't think so: Medium is better than Small, and Mini is faster than Small.
I tested Small and it's not bad at all... BUT
Regarding GLM-4, there is #8031. Or, you can try chatglm.cpp and chatllm.cpp. Tool calling is also supported.
See that it WAS important? Now the new mini-128k-instruct also fails to convert!
It will work after #8262 is merged, or if you change 'longrope' to 'su' in 'config.json'.
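For anyone wanting to apply the `config.json` workaround without hand-editing, here is a minimal sketch. The function name and file path are hypothetical; the `rope_scaling.type` field is the key the comment refers to in the Hugging Face checkpoint's `config.json`.

```python
import json
from pathlib import Path

def patch_rope_scaling(config_path: str) -> bool:
    """Relabel a 'longrope' rope_scaling type as 'su' in a HF config.json.

    Sketch of the workaround from this thread: the converter recognizes
    the 'su' scaling type but not 'longrope', so renaming the field lets
    conversion proceed. Returns True if the file was changed.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    rope = config.get("rope_scaling") or {}
    if rope.get("type") != "longrope":
        return False  # nothing to patch
    rope["type"] = "su"
    config["rope_scaling"] = rope
    path.write_text(json.dumps(config, indent=2))
    return True
```

Note the warning further down the thread: without a real longrope implementation, a model converted this way may still misbehave past 4k context.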
yep. that did it. |
@foldl I'm pretty sure that change alone is not enough to make these models work past 4k; we need an actual longrope implementation, which is not yet supported. The older 128k method from Microsoft was added, but longrope wasn't. That change just lets it fall through, likely using the wrong method, and will result in a broken model.
@bartowski1182 I am sure
Duplicate of #7922 and #6849. Please refer to #6849, #7705 or #8031 to contribute. Intentionally creating duplicate issues every few days splits the discussion across an unhelpful number of threads and makes the work more difficult. Please search for previously created issues before opening new ones. Closing this one as a duplicate. Thank you.
I did not do it intentionally. sorry. |
Why is that?
Phi-3 are the best models around at the moment (for their size).