
Conversation

@slekkala1 (Contributor) commented Sep 11, 2025

What does this PR do?

An earlier fix was attempted; however, since the Fireworks API already returns usage in its response, propagating that field through gives the end-to-end response the usage data.
Context in #3392
Closes #3391
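
For context, the change amounts to carrying the provider's usage block through when the upstream response is converted, rather than dropping it. A minimal sketch of the idea (names below are illustrative, not the actual openai_compat.py code):

```python
# Illustrative sketch only: propagate `usage` from the upstream provider
# response into the response returned to the client, instead of discarding it.
from typing import Any


def convert_chat_completion_response(upstream: dict[str, Any]) -> dict[str, Any]:
    response: dict[str, Any] = {
        "id": upstream["id"],
        "choices": upstream["choices"],
        "object": "chat.completion",
        "created": upstream["created"],
        "model": upstream["model"],
    }
    # The fix: Fireworks already returns token counts, so pass them through.
    usage = upstream.get("usage")
    if usage is not None:
        response["usage"] = {
            "prompt_tokens": usage["prompt_tokens"],
            "completion_tokens": usage["completion_tokens"],
            "total_tokens": usage["total_tokens"],
        }
    return response
```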

Test Plan

Build and run the server, then test with curl:

(llama-stack) (base) swapna942@swapna942-mac llama-stack % curl -X POST http://localhost:8321/v1/openai/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "X-LlamaStack-Provider-Data: {\"fireworks_api_key\": \"$FIREWORKS_API_KEY\"}" \
      -d '{
        "model": "fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
      }' | jq
{
  "metrics": [
    {
      "metric": "prompt_tokens",
      "value": 35,
      "unit": "tokens"
    },
    {
      "metric": "completion_tokens",
      "value": 10,
      "unit": "tokens"
    },
    {
      "metric": "total_tokens",
      "value": 45,
      "unit": "tokens"
    }
  ],
  "id": "chatcmpl-0646bc72-cf4a-4afd-8fa0-030275b452f6",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?",
        "name": null,
        "tool_calls": null
      },
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null
    }
  ],
  "object": "chat.completion",
  "created": 1757632496,
  "model": "fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct",
  "usage": {
    "prompt_tokens": 35,
    "completion_tokens": 10,
    "total_tokens": 45
  }
} 
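
The same check can be run with the openai Python client instead of curl (a quick sketch; assumes the stack server on localhost:8321 and the same provider-data header):

```python
# Verify usage propagation with the openai client instead of curl.
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",
    api_key="none",  # auth goes through the provider-data header below
    default_headers={
        "X-LlamaStack-Provider-Data": json.dumps(
            {"fireworks_api_key": os.environ["FIREWORKS_API_KEY"]}
        )
    },
)

resp = client.chat.completions.create(
    model="fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
# With the fix, usage should be populated rather than None.
print(resp.usage)
```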

meta-cla bot added the "CLA Signed" label on Sep 11, 2025
@mattf (Collaborator) left a review comment (quoted in the reply below)

@mattf (Collaborator) commented Sep 15, 2025

@slekkala1 we're moving all the adapters we can to the OpenAIMixin; will you convert the fireworks adapter to use OpenAIMixin and fix usage for everyone?

@slekkala1 (Contributor, Author) commented:

> this needs to also revert https://github.com/llamastack/llama-stack/pull/3392/files

We reverted this already @mattf, in #3402.

@slekkala1 (Contributor, Author) commented Sep 15, 2025

> @slekkala1 we're moving all the adapters we can to the OpenAIMixin; will you convert the fireworks adapter to use OpenAIMixin and fix usage for everyone?

@mattf Yes, I can do that; let me look into that instead. Please link any related tasks you have.

I see another similar bug was reported in #3420 for response.usage.

@mattf (Collaborator) commented Sep 15, 2025

> > @slekkala1 we're moving all the adapters we can to the OpenAIMixin; will you convert the fireworks adapter to use OpenAIMixin and fix usage for everyone?
>
> @mattf Yes, I can do that; let me look into that instead. Please link any related tasks you have.
>
> I see another similar bug was reported in #3420 for response.usage.

#3387 has a list of other provider updates.

I expect Fireworks can end up looking like https://github.com/llamastack/llama-stack/blob/main/llama_stack/providers/remote/inference/gemini/gemini.py

However, don't bother updating the embedding/completion/chat_completion methods; they're on their way out (#2365).
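
For reference, a mixin-based adapter along the lines of the gemini.py example can be quite small; a rough sketch (the import path, config class, and attribute names are assumptions, not the actual llama-stack code):

```python
# Rough sketch of a Fireworks adapter built on OpenAIMixin, modeled on the
# gemini.py adapter linked above. Import path and config fields are assumed.
from llama_stack.providers.utils.inference.openai_mixin import OpenAIMixin  # assumed path

from .config import FireworksImplConfig  # assumed config class


class FireworksInferenceAdapter(OpenAIMixin):
    def __init__(self, config: FireworksImplConfig) -> None:
        self.config = config

    def get_api_key(self) -> str:
        # Assumed: the key comes from config or the per-request provider data.
        return self.config.api_key

    def get_base_url(self) -> str:
        # Fireworks' OpenAI-compatible endpoint.
        return "https://api.fireworks.ai/inference/v1"
```

The mixin would then supply the OpenAI-compatible chat-completion path, so usage propagation comes for free for every adapter built on it.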

@mattf (Collaborator) commented Sep 16, 2025

this may help for testing -

diff --git a/tests/integration/suites.py b/tests/integration/suites.py
index bacd7ef5..4e653605 100644
--- a/tests/integration/suites.py
+++ b/tests/integration/suites.py
@@ -90,6 +90,15 @@ SETUP_DEFINITIONS: dict[str, Setup] = {
             "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
         },
     ),
+    "fireworks": Setup(
+        name="fireworks",
+        description="Fireworks provider with a text model",
+        defaults={
+            "text_model": "accounts/fireworks/models/llama-v3p2-3b-instruct",
+            "vision_model": "accounts/fireworks/models/llama-v3p2-11b-vision-instruct",
+            "embedding_model": "accounts/fireworks/models/qwen3-embedding-8b",
+        },
+    ),
 }
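
With that setup registered, the integration suite can presumably be pointed at Fireworks with something like the following (the flag name is an assumption inferred from the Setup registry above; check the repo's test docs):

```sh
# Assumed invocation; verify the exact flags against the integration-test docs.
FIREWORKS_API_KEY=... pytest tests/integration --setup=fireworks
```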

@slekkala1 (Contributor, Author) commented:

> this may help for testing - (suites.py diff quoted above)

Thanks for this; yes, looking into some test failures for Fireworks. Running into rate limit exceeded errors.

Thinking of having a separate diff once I fix some of these tests. Will close this diff soon.

@slekkala1 (Contributor, Author) commented:

@mattf Closing this diff; opened a new draft at #3480 with tests and the OpenAIMixin refactoring.
However, since the Fireworks provider uses OpenAIChatCompletionToLlamaStackMixin for llama_models, we'll continue to fix this response.usage bug in openai_compat.py.

Merging this pull request may close: Fireworks model chat completion broken with telemetry