With each token processed, inference slows down a little. The slowdown becomes noticeable at around 50 tokens on 40B Q3_K and keeps adding up from there.
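One way to put a number on this is to time each token separately and fit a line through the per-token latencies. The snippet below is only a rough sketch: `step` is a hypothetical callable standing in for whatever generates one token in your setup, and `time_tokens` and `latency_slope` are names made up for illustration, not part of any existing tool.

```python
import time
from typing import Callable, List


def time_tokens(step: Callable[[], object], n_tokens: int) -> List[float]:
    """Call `step` n_tokens times and return how long each call took, in seconds.

    `step` is a placeholder (assumption) for "generate one token".
    """
    latencies = []
    for _ in range(n_tokens):
        start = time.perf_counter()
        step()
        latencies.append(time.perf_counter() - start)
    return latencies


def latency_slope(latencies: List[float]) -> float:
    """Least-squares slope of latency vs. token index (extra seconds per token)."""
    n = len(latencies)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(latencies) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(latencies))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var
```

A slope that stays clearly above zero over a few hundred tokens would be consistent with the per-token slowdown described above, while a slope near zero would point to noise instead.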