Q8_0: unbreak AVX #1117

sw · 2023-04-22T08:09:56Z

#1109 was left unfinished for the AVX code path (note: that affects all quantized formats, not just Q4_3 as the summary would suggest). This PR fixes it by introducing hsum_i32_4, a horizontal sum of four 32-bit integers, which is used to calculate s0 and s1.
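For readers unfamiliar with horizontal sums, here is a minimal sketch of what a helper like hsum_i32_4 can look like. The PR's actual diff is not reproduced on this page, so the body below is an assumption based only on the name: it adds the four 32-bit lanes of a 128-bit vector using SSE2 intrinsics, which are available on AVX-only machines. The wrapper hsum_i32_4_from_ints is a hypothetical name added purely for illustration, with a scalar fallback so the snippet also builds on non-x86 targets.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Sketch (assumed implementation): sum the four 32-bit lanes of a. */
static inline int hsum_i32_4(const __m128i a) {
    /* Copy the upper 64 bits (lanes 2 and 3) into the lower half. */
    const __m128i hi64  = _mm_unpackhi_epi64(a, a);
    /* Lane 0 = a0 + a2, lane 1 = a1 + a3. */
    const __m128i sum64 = _mm_add_epi32(hi64, a);
    /* Swap adjacent 32-bit lanes so lane 0 holds a1 + a3. */
    const __m128i hi32  = _mm_shuffle_epi32(sum64, _MM_SHUFFLE(2, 3, 0, 1));
    /* Lane 0 = a0 + a1 + a2 + a3; extract it as a scalar. */
    return _mm_cvtsi128_si32(_mm_add_epi32(sum64, hi32));
}
#endif

/* Hypothetical wrapper for illustration: packs four ints and sums them. */
static inline int hsum_i32_4_from_ints(int32_t x0, int32_t x1,
                                       int32_t x2, int32_t x3) {
#if defined(__SSE2__)
    /* _mm_set_epi32 takes arguments from the highest lane to the lowest. */
    return hsum_i32_4(_mm_set_epi32(x3, x2, x1, x0));
#else
    /* Scalar fallback for targets without SSE2. */
    return x0 + x1 + x2 + x3;
#endif
}
```

A 128-bit helper is a natural fit here because plain AVX, unlike AVX2, has no 256-bit integer arithmetic, so integer accumulators are typically handled as 128-bit halves.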

ggerganov · 2023-04-22T08:15:01Z

I added commented flags to the Makefile that can be used to go in AVX-only mode for easier debugging in the future:

https://github.com/ggerganov/llama.cpp/blob/master/Makefile#L79-L83
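When building in AVX-only mode for debugging, it can help to confirm which SIMD path the compiler actually selected. The sketch below does not reproduce the Makefile flags linked above; it only relies on the __AVX__ and __AVX2__ macros that GCC and Clang predefine when the corresponding instruction sets are enabled. The function name simd_path is hypothetical.

```c
/* Hypothetical helper: report which code path this translation unit
 * was compiled for, based on compiler-predefined macros. */
static const char *simd_path(void) {
#if defined(__AVX2__)
    return "AVX2";    /* AVX2 enabled: 256-bit integer ops available */
#elif defined(__AVX__)
    return "AVX";     /* AVX-only: the path this PR repairs */
#else
    return "scalar";  /* neither AVX nor AVX2 enabled at compile time */
#endif
}
```

Printing this string at startup is one way to check that a debug build really exercises the AVX-only branch rather than silently using AVX2.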

Q8_0: unbreak AVX

7085407

sw closed this Apr 22, 2023

sw deleted the q8-avx branch April 22, 2023 08:11

Bearsaerker mentioned this pull request Mar 12, 2025

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Closed
