Bug: Last 2 Chunks In Streaming Mode Come Together In Firefox #9502

Closed
@CentricStorm

What happened?

When using /completion with stream: true, the last two JSON chunks arrive together in Firefox. Chrome delivers them separately, so this might be a Firefox bug.
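As a possible client-side workaround, a reader can buffer incoming text and split it on the blank-line separator instead of assuming one JSON object per read. The sketch below is illustrative only: the `split_events` helper is hypothetical, and the `data: ` prefix and the `content`/`stop` field names are assumptions about the stream format, not something confirmed in this report.

```python
import json


def split_events(buffer: str, incoming: str):
    """Append newly read text to the buffer and extract complete events.

    Returns (events, remaining_buffer). An event is considered complete
    once its terminating blank line ("\n\n") has been received.
    """
    buffer += incoming
    # Everything before the last separator is complete; the tail may be partial.
    *complete, rest = buffer.split("\n\n")
    events = []
    for raw in complete:
        for line in raw.splitlines():
            # Assumption: each event carries its JSON on a "data: " line.
            if line.startswith("data: "):
                events.append(json.loads(line[len("data: "):]))
    return events, rest


# Two events delivered in a single read, as described above for Firefox.
both = 'data: {"content": "!", "stop": false}\n\ndata: {"content": "", "stop": true}\n\n'
events, rest = split_events("", both)

# One event split across two reads is handled by the same buffer.
part_events, part_rest = split_events("", 'data: {"con')
late_events, late_rest = split_events(part_rest, 'tent": "x"}\n\n')
```

With this approach the number of events recovered does not depend on how the browser or network stack groups bytes into reads.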

On closer inspection, HTTP Transfer-Encoding: chunked appears to require each chunk to be terminated with \r\n, but the server code linked below ends each streamed message with \n\n instead:

https://github.com/ggerganov/llama.cpp/blob/6262d13e0b2da91f230129a93a996609a2f5a2f2/examples/server/utils.hpp#L296-L299

This does not seem to be a Windows-only requirement; it is listed as part of the HTTP specification:
HTTP Chunked Transfer Coding

More information, including an example chunked response:
Transfer-Encoding Directives
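For reference, here is a minimal sketch of the chunk framing as the HTTP specification describes it: a hexadecimal size line ending in \r\n, the chunk data, another \r\n, and a final zero-size chunk. The function names and sample payloads are hypothetical, and trailers are not handled. In this framing a \n\n inside a payload is carried as ordinary data. Whether the linked lines produce the chunk framing itself or only the payload is not established here.

```python
def encode_chunked(payloads):
    """Frame each payload as: <size in hex>\r\n<data>\r\n, then a final 0-size chunk."""
    out = b""
    for payload in payloads:
        out += f"{len(payload):X}".encode("ascii") + b"\r\n" + payload + b"\r\n"
    return out + b"0\r\n\r\n"


def decode_chunked(body):
    """Recover the payloads from a chunked body (trailers not supported)."""
    payloads = []
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # Ignore any chunk extensions after ';' on the size line.
        size = int(body[pos:eol].split(b";")[0], 16)
        if size == 0:
            return payloads
        start = eol + 2
        payloads.append(body[start:start + size])
        # Skip the data and the \r\n that terminates the chunk.
        pos = start + size + 2


# Hypothetical payloads that each end in "\n\n".
payloads = [b'data: {"a": 1}\n\n', b'data: {"b": 2}\n\n']
wire = encode_chunked(payloads)
decoded = decode_chunked(wire)
```

Comparing the raw bytes on the wire (for example with a packet capture) against this framing would show whether the \r\n chunk terminators are actually missing.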

Name and Version

llama-server.exe
version: 3761 (6262d13)
built with MSVC 19.29.30154.0 for x64

What operating system are you seeing the problem on?

Windows

Relevant log output

No response

Labels

bug-unconfirmed, medium severity