Consistent chat completion id in OpenAI compatible chat completion endpoint #5876

@xyc

First, thank you for this great project!

I was wondering whether, for the OpenAI-compatible chat completion endpoint, streaming responses should return the same completion id (the one prefixed with chatcmpl-) in every chunk.

For example, the two chunks in the following response carry different chat completion ids (chatcmpl-2EHCQqsRzdOlFskNehCMu2oOMTXhSjey and chatcmpl-Cm7q0Ru5uEGVlW4r6cZaGNQrlS7oF724):

{"choices":[{"delta":{"content":"Under"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-2EHCQqsRzdOlFskNehCMu2oOMTXhSjey","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

{"choices":[{"delta":{"content":"stood"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-Cm7q0Ru5uEGVlW4r6cZaGNQrlS7oF724","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

...
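A quick way to confirm the mismatch is to collect the distinct ids across the received chunks. The snippet below is only a minimal sketch: `distinctChunkIds` is a hypothetical helper name, and the two objects are the chunks quoted above, trimmed to the fields that matter here.

```javascript
// Hypothetical helper: return the distinct completion ids found in a list of parsed chunks.
function distinctChunkIds(chunks) {
  return [...new Set(chunks.map((chunk) => chunk.id))];
}

// The two chunks quoted above, reduced to the relevant fields.
const chunks = [
  {
    id: "chatcmpl-2EHCQqsRzdOlFskNehCMu2oOMTXhSjey",
    object: "chat.completion.chunk",
    choices: [{ index: 0, delta: { content: "Under" }, finish_reason: null }],
  },
  {
    id: "chatcmpl-Cm7q0Ru5uEGVlW4r6cZaGNQrlS7oF724",
    object: "chat.completion.chunk",
    choices: [{ index: 0, delta: { content: "stood" }, finish_reason: null }],
  },
];

const ids = distinctChunkIds(chunks);
console.log(
  ids.length === 1 ? "consistent id" : `inconsistent ids: ${ids.join(", ")}`
);
```

A stream with a single consistent id would yield exactly one entry in the returned array.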

Example Node.js code that generates the above chunks:

import OpenAI from "openai";

process.env["OPENAI_API_KEY"] = "no-key";

const openai = new OpenAI({
  baseURL: "http://127.0.0.1:8080/v1",
  apiKey: "no-key",
});

async function main() {
  const stream = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Say this is a test" }],
    stream: true,
  });
  for await (const chunk of stream) {
    process.stdout.write(JSON.stringify(chunk));
  }
}

main();

This works when the chunks are consumed directly on the Node.js server side. However, if I stream the HTTP response onward and consume it with OpenAI's Node.js SDK, I get this error: missing finish_reason for choice 0. It seems that when the SDK receives a chunk with a different id, #endRequest is called prematurely, and the corresponding chunk does not have a finish_reason.
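Until the server emits a stable id, one possible stopgap is to rewrite the id on the relaying side before the chunks are forwarded to the SDK consumer. This is only a sketch of a workaround, not a fix for the endpoint itself: `withStableId`, `fakeStream`, and `demo` are hypothetical names, and `fakeStream` merely stands in for the chunk stream returned by the server.

```javascript
// Hypothetical helper: re-emit every chunk with the id of the first chunk,
// so a downstream consumer sees a single completion id for the whole stream.
async function* withStableId(stream) {
  let firstId;
  for await (const chunk of stream) {
    if (firstId === undefined) firstId = chunk.id;
    yield { ...chunk, id: firstId };
  }
}

// Stand-in for a server stream whose chunks carry different ids.
async function* fakeStream() {
  yield { id: "chatcmpl-A", choices: [{ index: 0, delta: { content: "Under" }, finish_reason: null }] };
  yield { id: "chatcmpl-B", choices: [{ index: 0, delta: { content: "stood" }, finish_reason: "stop" }] };
}

// Collect the ids that a downstream consumer would observe.
async function demo() {
  const seen = [];
  for await (const chunk of withStableId(fakeStream())) {
    seen.push(chunk.id);
  }
  return seen;
}

demo().then((seen) => console.log(seen));
```

In the server-side loop shown earlier, this would amount to iterating over `withStableId(stream)` instead of `stream`. The original chunks are left untouched because each one is shallow-copied before its id is replaced.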

Example client side code:

import fetch from 'node-fetch';
import { ChatCompletionStream } from 'openai/lib/ChatCompletionStream';

fetch('http://localhost:3000', {
  method: 'POST',
  body: 'Tell me why dogs are better than cats',
  headers: { 'Content-Type': 'text/plain' },
}).then(async (res) => {
  // @ts-ignore ReadableStream on different environments can be strange
  const runner = ChatCompletionStream.fromReadableStream(res.body);

  runner.on('content', (delta, snapshot) => {
    process.stdout.write(delta);
    // or, in a browser, you might display like this:
    // document.body.innerText += delta; // or:
    // document.body.innerText = snapshot;
  });

  console.dir(await runner.finalChatCompletion(), { depth: null });
});

Labels: bug, good first issue