System Info
(llama-stack) (base) swapna942@swapna942-mac llama-stack % python -m "torch.utils.collect_env"
:128: RuntimeWarning: 'torch.utils.collect_env' found in sys.modules after import of package 'torch.utils', but prior to execution of 'torch.utils.collect_env'; this may result in unpredictable behaviour
Collecting environment information...
PyTorch version: 2.8.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 15.6.1 (arm64)
GCC version: Could not collect
Clang version: 17.0.0 (clang-1700.0.13.5)
CMake version: version 4.0.3
Libc version: N/A
Python version: 3.13.7 (main, Aug 14 2025, 11:12:11) [Clang 17.0.0 (clang-1700.0.13.3)] (64-bit runtime)
Python platform: macOS-15.6.1-arm64-arm-64bit-Mach-O
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M4 Max
Versions of relevant libraries:
[pip3] Could not collect
[conda] numpy 2.3.1 pypi_0 pypi
[conda] torch 2.7.1 pypi_0 pypi
Information
- The official example scripts
- My own modified scripts
🐛 Describe the bug
Steps to reproduce:
- Start the server:
  uv run --with llama-stack llama stack build --distro starter --image-type venv --run
- Send a chat completion request:
  curl -X POST http://0.0.0.0:8321/v1/openai/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
- The request fails with a 500:
  {"detail":"Internal server error: An unexpected error occurred."}
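For reference, the same 500 should be reproducible with the openai Python client pointed at the local server. This is a sketch: the base URL and model ID are taken from the curl call above, and the api_key value is a placeholder (the local server does not validate it here).

  from openai import OpenAI

  # Point the OpenAI client at the Llama Stack OpenAI-compatible endpoint.
  client = OpenAI(base_url="http://0.0.0.0:8321/v1/openai/v1", api_key="none")

  # Expected to trigger the same internal server error as the curl call above.
  resp = client.chat.completions.create(
      model="fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct",
      messages=[{"role": "user", "content": "Hello!"}],
  )
  print(resp.choices[0].message.content)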
Error logs
In the server-side logs we see:
INFO 2025-09-09 15:14:25,353 console_span_processor:28 telemetry: 22:14:25.353 [START] /v1/openai/v1/chat/completions
INFO 2025-09-09 15:14:25,359 console_span_processor:39 telemetry: 22:14:25.355 [END] ModelsRoutingTable.get_model [StatusCode.OK] (0.22ms)
INFO 2025-09-09 15:14:25,360 console_span_processor:48 telemetry: output: {'identifier':
'fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct', 'provider_resource_id': 'accounts/fireworks/models/llama-v3p1-8b-instruct',
'provider_id': 'fireworks', 'type': 'model', 'owner': None, 'source': 'listed_from_provider', 'metadata': {}, 'model_type': 'llm'}
INFO 2025-09-09 15:14:25,362 console_span_processor:39 telemetry: 22:14:25.361 [END] ModelsRoutingTable.get_provider_impl [StatusCode.OK] (0.20ms)
INFO 2025-09-09 15:14:25,362 console_span_processor:48 telemetry: output:
<llama_stack.providers.remote.inference.fireworks.fireworks.FireworksInferenceAdapter object at 0x1143e56a0>
INFO 2025-09-09 15:14:25,364 console_span_processor:39 telemetry: 22:14:25.363 [END] ModelsRoutingTable.get_model [StatusCode.OK] (0.21ms)
INFO 2025-09-09 15:14:25,365 console_span_processor:48 telemetry: output: {'identifier':
'fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct', 'provider_resource_id': 'accounts/fireworks/models/llama-v3p1-8b-instruct',
'provider_id': 'fireworks', 'type': 'model', 'owner': None, 'source': 'listed_from_provider', 'metadata': {}, 'model_type': 'llm'}
INFO 2025-09-09 15:14:25,367 console_span_processor:39 telemetry: 22:14:25.366 [END] ModelsRoutingTable.get_model [StatusCode.OK] (0.17ms)
INFO 2025-09-09 15:14:25,367 console_span_processor:48 telemetry: output: {'identifier':
'fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct', 'provider_resource_id': 'accounts/fireworks/models/llama-v3p1-8b-instruct',
'provider_id': 'fireworks', 'type': 'model', 'owner': None, 'source': 'listed_from_provider', 'metadata': {}, 'model_type': 'llm'}
ERROR 2025-09-09 15:14:25,634 __main__:257 core::server: Error executing endpoint route='/v1/openai/v1/chat/completions' method='post':
'OpenAIChatCompletion' object has no attribute 'usage'
INFO 2025-09-09 15:14:25,635 uvicorn.access:473 uncategorized: 127.0.0.1:65526 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 500
INFO 2025-09-09 15:14:25,639 console_span_processor:39 telemetry: 22:14:25.636 [END] FireworksInferenceAdapter.chat_completion [StatusCode.OK]
(270.81ms)
INFO 2025-09-09 15:14:25,640 console_span_processor:48 telemetry: output: {'metrics': None, 'completion_message': {'role': 'assistant',
'content': 'Hello! How can I assist you today?', 'stop_reason': 'end_of_turn', 'tool_calls': []}, 'logprobs': None}
INFO 2025-09-09 15:14:25,642 console_span_processor:39 telemetry: 22:14:25.641 [END] FireworksInferenceAdapter.openai_chat_completion
[StatusCode.OK] (277.90ms)
INFO 2025-09-09 15:14:25,643 console_span_processor:48 telemetry: output: {'id': 'chatcmpl-8bfeb3b1-9a09-468f-9347-d55f1debe3b7', 'choices':
[{'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'name': None, 'tool_calls': None}, 'finish_reason':
'stop', 'index': 0, 'logprobs': None}], 'object': 'chat.completion', 'created': 1757456065, 'model':
'fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct'}
INFO 2025-09-09 15:14:25,645 console_span_processor:39 telemetry: 22:14:25.643 [END] InferenceRouter.openai_chat_completion [StatusCode.OK]
(289.42ms)
INFO 2025-09-09 15:14:25,646 console_span_processor:48 telemetry: error: 'OpenAIChatCompletion' object has no attribute 'usage'
INFO 2025-09-09 15:14:25,648 console_span_processor:39 telemetry: 22:14:25.647 [END] /v1/openai/v1/chat/completions [StatusCode.OK] (293.99ms)
INFO 2025-09-09 15:14:25,649 console_span_processor:48 telemetry: raw_path: /v1/openai/v1/chat/completions
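Note the ordering in the spans above: FireworksInferenceAdapter.openai_chat_completion completes with StatusCode.OK, and the AttributeError is only recorded at InferenceRouter.openai_chat_completion, so the failure is presumably in the router's usage/metrics handling after the provider call succeeds. Below is a minimal sketch of the failure mode and a defensive guard; the class is an illustrative stand-in (a bare Pydantic model with no usage field, mirroring the logged response, which contains no usage key), not the actual llama-stack definition.

  from pydantic import BaseModel

  class OpenAIChatCompletion(BaseModel):
      # Illustrative stand-in: no `usage` field is declared, matching the
      # response dict in the telemetry output above.
      id: str
      model: str

  completion = OpenAIChatCompletion(id="chatcmpl-123", model="llama-v3p1-8b-instruct")

  # Unconditional access reproduces the logged error:
  #   AttributeError: 'OpenAIChatCompletion' object has no attribute 'usage'
  # tokens = completion.usage.total_tokens

  # A defensive guard would let the request succeed without usage metrics:
  usage = getattr(completion, "usage", None)
  total_tokens = usage.total_tokens if usage is not None else None
  print(total_tokens)  # -> None instead of a 500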
Expected behavior
The request should not fail: chat completion needs to succeed even if telemetry/usage metrics are unavailable.