I have the implementation ready, but I'm not sure this is what we want. Using an importance matrix improves perplexity for every model I have tried. On the other hand, the "legacy" ggml quants Q4_0 and Q5_0 are never very good, but they are also never really bad (Q4_1 and Q5_1 behave more erratically: better than Q4_0/Q5_0 for some models, worse for others). Hence, one may want to preserve them as they are, as a kind of reference.