STILL no way to convert phi-3-small to GGUF #8241


Closed
0wwafa opened this issue Jul 1, 2024 · 10 comments
Labels
duplicate This issue or pull request already exists

Comments

@0wwafa

0wwafa commented Jul 1, 2024

Why is that?
The Phi-3 models are the best around at the moment (for their size).

@foldl
Contributor

foldl commented Jul 2, 2024

It uses a different model architecture: Triton block sparse attention.

It would take a lot of effort. Is the work worth doing? I don't think so: Medium is better than Small, and Mini is faster than Small.
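To illustrate what makes this architecture different: in block sparse attention, each query block only attends to a small set of key blocks instead of the full causal context. The sketch below is a minimal illustration of the idea only; the parameter names and the exact sparsity pattern are illustrative, not Phi-3-small's actual configuration.

```python
def block_sparse_mask(seq_len, block_size, local_blocks):
    """Build a causal block-sparse attention mask (True = may attend).

    Each query block attends to its own block and the `local_blocks`
    preceding blocks; all other key positions are masked out.
    """
    n_blocks = seq_len // block_size
    mask = [[False] * seq_len for _ in range(seq_len)]
    for qb in range(n_blocks):
        first_kb = max(0, qb - local_blocks)
        for kb in range(first_kb, qb + 1):
            for q in range(qb * block_size, (qb + 1) * block_size):
                for k in range(kb * block_size, (kb + 1) * block_size):
                    mask[q][k] = k <= q  # stay causal inside allowed blocks
    return mask

mask = block_sparse_mask(seq_len=8, block_size=2, local_blocks=1)
```

Supporting this in llama.cpp would mean implementing such a masking scheme (and its Triton-kernel equivalent) natively, which is why the conversion is not just a tensor-renaming exercise.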

@0wwafa
Author

0wwafa commented Jul 2, 2024

@foldl

It uses a different model architecture: Triton block sparse attention.

It would take a lot of effort. Is the work worth doing? I don't think so: Medium is better than Small, and Mini is faster than Small.

I tested the Small and it's not bad at all...
Sincerely, perhaps implementing GLM-4 would be more important than this, and the team behind GLM-4 used a modified version of llama.cpp, so it should not be difficult to port.

BUT
the Phi-3 family is the best I have seen so far, and it won on the leaderboard against models twice its size... it would be interesting to test the Small as well.

@foldl
Contributor

foldl commented Jul 2, 2024

Regarding GLM-4, there is #8031.

Or, you can try chatglm.cpp and chatllm.cpp. Tool calling is also supported.

@0wwafa
Author

0wwafa commented Jul 2, 2024

See, it WAS important! Now the new mini-128k-instruct does not convert either!

@foldl
Contributor

foldl commented Jul 3, 2024

It will convert after #8262 is merged, or if you change 'longrope' to 'su' in 'config.json'.

@0wwafa
Author

0wwafa commented Jul 3, 2024

change 'longrope' to 'su' in 'config.json'.

Yep, that did it.
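For anyone applying the same workaround, the one-word edit to config.json can be scripted. This is a minimal sketch under the assumption that the model's rope_scaling section carries a "type" field; the path handling is illustrative.

```python
import json

def patch_rope_scaling(config_path):
    """Rewrite rope_scaling type "longrope" -> "su" in a config.json.

    "longrope" is the same scaling scheme under a new name, so a
    converter that only recognizes "su" accepts the file after this
    one-word change (see the discussion above).
    """
    with open(config_path) as f:
        cfg = json.load(f)
    scaling = cfg.get("rope_scaling") or {}
    if scaling.get("type") == "longrope":
        scaling["type"] = "su"
        with open(config_path, "w") as f:
            json.dump(cfg, f, indent=2)
    return cfg
```

Note the caveat raised below: this only makes the file convert; it does not by itself guarantee correct long-context behavior.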

@bartowski1182
Contributor

@foldl I'm pretty sure that change alone is not enough to make these models work past 4k; we need an actual longrope implementation, which is not yet supported. The older 128k method from Microsoft was added, but longrope wasn't. That change just allows conversion to fall through, likely uses the wrong method, and will result in a broken model.

@foldl
Contributor

foldl commented Jul 3, 2024

@bartowski1182 I am sure Phi3LongRoPEScaledRotaryEmbedding is just Phi3SuScaledRotaryEmbedding renamed: a new name and nothing else. I am not sure about the status of Phi3SuScaledRotaryEmbedding in llama.cpp. If it is supported, then the June 2024 Update will just work too.
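To make the equivalence concrete, the 'su'/'longrope' scheme rescales each rotary (RoPE) inverse frequency by a learned per-dimension factor and applies an extra attention scale when the context is extended beyond the original training length. The sketch below is an illustration of that idea only; the function names and signatures are assumptions, not the actual Hugging Face or llama.cpp API.

```python
import math

def su_scaled_inv_freq(dim, base, factors):
    """Per-dimension scaled RoPE inverse frequencies.

    Each standard RoPE frequency base**(-2i/dim) is divided by a
    learned rescale factor (the model's short_factor/long_factor
    lists). Identical under both the "su" and "longrope" names.
    """
    inv_freq = [base ** (-2 * i / dim) for i in range(dim // 2)]
    return [f / s for f, s in zip(inv_freq, factors)]

def attention_scale(max_pos, orig_max_pos):
    """Extra attention scaling applied when extending past the
    original training context; 1.0 when no extension is needed."""
    scale = max_pos / orig_max_pos
    if scale <= 1.0:
        return 1.0
    return math.sqrt(1 + math.log(scale) / math.log(orig_max_pos))
```

If this is all the rename changed, then a backend that already handles the 'su' factors correctly would handle 'longrope' checkpoints too, which is exactly the question about llama.cpp's status above.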

@HanClinto
Collaborator

HanClinto commented Jul 3, 2024

Duplicate of #7922 and #6849. Please refer to #6849, #7705 or #8031 to contribute.

Creating intentionally duplicate issues every few days is splitting the discussion across an unhelpful number of threads and making work more difficult. Please search for previously created issues before opening new ones.

Closing this one as duplicate.

Thank you.

@HanClinto closed this as not planned (duplicate) Jul 3, 2024
@HanClinto added the "duplicate" label Jul 3, 2024
@0wwafa
Author

0wwafa commented Jul 3, 2024

Duplicate of #7922 and #6849. Please refer to #6849, #7705 or #8031 to contribute.

Creating intentionally duplicate issues every few days is splitting the discussion across an unhelpful number of threads and making work more difficult. Please search for previously created issues before opening new ones.

Closing this one as duplicate.

Thank you.

I did not do it intentionally. Sorry.


4 participants