Nvidia provider failed in starter distro #3715

@wukaixingxp

Description

System Info

H100 node; llama-stack installed with uv pip install llama-stack

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

Setting NVIDIA_API_KEY=dummy and NVIDIA_BASE_URL=http://localhost:8912 to use the nvidia provider in the starter distro fails with the error: 'NVIDIAInferenceAdapter' object has no attribute 'alias_to_provider_id_map'
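
For context, this shape of AttributeError usually means an attribute that a base class creates in its __init__ was never set because the subclass's __init__ does not chain up to it. Below is a minimal Python sketch with hypothetical class names (not the actual llama-stack code) that reproduces the same error message shape:

# Hypothetical sketch of the suspected failure mode, not the real classes:
# a registry helper builds alias_to_provider_id_map in __init__, but a
# subclass overrides __init__ without calling super().__init__(), so any
# later lookup that touches the map raises AttributeError.

class ModelRegistryHelperSketch:
    def __init__(self, model_entries):
        # The attribute the server later reads during model registration.
        self.alias_to_provider_id_map = {
            alias: entry["provider_model_id"]
            for entry in model_entries
            for alias in entry.get("aliases", [])
        }

    def get_provider_model_id(self, alias):
        return self.alias_to_provider_id_map.get(alias)


class NvidiaAdapterSketch(ModelRegistryHelperSketch):
    def __init__(self, base_url):
        # Bug pattern: super().__init__(...) is never called, so
        # alias_to_provider_id_map is never created on the instance.
        self.base_url = base_url


adapter = NvidiaAdapterSketch("http://localhost:8912")
adapter.get_provider_model_id("meta/llama-3.1-8b-instruct")
# AttributeError: 'NvidiaAdapterSketch' object has no attribute
# 'alias_to_provider_id_map'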

Error logs

(venv) ~/kai$ NVIDIA_API_KEY=dummy NVIDIA_BASE_URL=http://localhost:8912 llama stack run /home/nvidia/.llama/distributions/starter/starter-run.yaml --image-type venv 
INFO     2025-10-06 23:25:38,532 llama_stack.core.utils.config_resolution:45 core: Using file path:
         /home/nvidia/.llama/distributions/starter/starter-run.yaml
INFO     2025-10-06 23:25:38,536 llama_stack.cli.stack.run:129 cli: Using run configuration:
         /home/nvidia/.llama/distributions/starter/starter-run.yaml
Using virtual environment: /home/nvidia/kai/venv
Virtual environment already activated
+ '[' -n /home/nvidia/.llama/distributions/starter/starter-run.yaml ']'
+ yaml_config_arg=/home/nvidia/.llama/distributions/starter/starter-run.yaml
+ python -m llama_stack.core.server.server /home/nvidia/.llama/distributions/starter/starter-run.yaml --port 8321
INFO     2025-10-06 23:25:40,250 llama_stack.core.utils.config_resolution:45 core: Using file path:
         /home/nvidia/.llama/distributions/starter/starter-run.yaml
INFO     2025-10-06 23:25:40,272 __main__:593 core::server: Run configuration:
INFO     2025-10-06 23:25:40,282 __main__:596 core::server: apis:
         - agents
         - batches
         - datasetio
         - eval
         - files
         - inference
         - post_training
         - safety
         - scoring
         - telemetry
         - tool_runtime
         - vector_io
         benchmarks: []
         datasets: []
         image_name: starter
         inference_store:
           db_path: /home/nvidia/.llama/distributions/starter/inference_store.db
           type: sqlite
         metadata_store:
           db_path: /home/nvidia/.llama/distributions/starter/registry.db
           type: sqlite
         models: []
         providers:
           agents:
           - config:
               persistence_store:
                 db_path: /home/nvidia/.llama/distributions/starter/agents_store.db
                 type: sqlite
               responses_store:
                 db_path: /home/nvidia/.llama/distributions/starter/responses_store.db
                 type: sqlite
             provider_id: meta-reference
             provider_type: inline::meta-reference
           batches:
           - config:
               kvstore:
                 db_path: /home/nvidia/.llama/distributions/starter/batches.db
                 type: sqlite
             provider_id: reference
             provider_type: inline::reference
           datasetio:
           - config:
               kvstore:
                 db_path: /home/nvidia/.llama/distributions/starter/huggingface_datasetio.db
                 type: sqlite
             provider_id: huggingface
             provider_type: remote::huggingface
           - config:
               kvstore:
                 db_path: /home/nvidia/.llama/distributions/starter/localfs_datasetio.db
                 type: sqlite
             provider_id: localfs
             provider_type: inline::localfs
           eval:
           - config:
               kvstore:
                 db_path: /home/nvidia/.llama/distributions/starter/meta_reference_eval.db
                 type: sqlite
             provider_id: meta-reference
             provider_type: inline::meta-reference
           files:
           - config:
               metadata_store:
                 db_path: /home/nvidia/.llama/distributions/starter/files_metadata.db
                 type: sqlite
               storage_dir: /home/nvidia/.llama/distributions/starter/files
             provider_id: meta-reference-files
             provider_type: inline::localfs
           inference:
           - config:
               api_key: '********'
               url: https://api.fireworks.ai/inference/v1
             provider_id: fireworks
             provider_type: remote::fireworks
           - config:
               api_key: '********'
               url: https://api.together.xyz/v1
             provider_id: together
             provider_type: remote::together
           - config: {}
             provider_id: bedrock
             provider_type: remote::bedrock
           - config:
               api_key: '********'
               append_api_version: true
               url: http://localhost:8912
             provider_id: nvidia
             provider_type: remote::nvidia
           - config:
               api_key: '********'
               base_url: https://api.openai.com/v1
             provider_id: openai
             provider_type: remote::openai
           - config:
               api_key: '********'
             provider_id: anthropic
             provider_type: remote::anthropic
           - config:
               api_key: '********'
             provider_id: gemini
             provider_type: remote::gemini
           - config:
               api_key: '********'
               url: https://api.groq.com
             provider_id: groq
             provider_type: remote::groq
           - config:
               api_key: '********'
               url: https://api.sambanova.ai/v1
             provider_id: sambanova
             provider_type: remote::sambanova
           - config: {}
             provider_id: sentence-transformers
             provider_type: inline::sentence-transformers
           post_training:
           - config:
               checkpoint_format: meta
             provider_id: torchtune-cpu
             provider_type: inline::torchtune-cpu
           safety:
           - config:
               excluded_categories: []
             provider_id: llama-guard
             provider_type: inline::llama-guard
           - config: {}
             provider_id: code-scanner
             provider_type: inline::code-scanner
           scoring:
           - config: {}
             provider_id: basic
             provider_type: inline::basic
           - config: {}
             provider_id: llm-as-judge
             provider_type: inline::llm-as-judge
           - config:
               openai_api_key: '********'
             provider_id: braintrust
             provider_type: inline::braintrust
           telemetry:
           - config:
               service_name: "\u200B"
               sinks: console,sqlite
               sqlite_db_path: /home/nvidia/.llama/distributions/starter/trace_store.db
             provider_id: meta-reference
             provider_type: inline::meta-reference
           tool_runtime:
           - config:
               api_key: '********'
               max_results: 3
             provider_id: brave-search
             provider_type: remote::brave-search
           - config:
               api_key: '********'
               max_results: 3
             provider_id: tavily-search
             provider_type: remote::tavily-search
           - config: {}
             provider_id: rag-runtime
             provider_type: inline::rag-runtime
           - config: {}
             provider_id: model-context-protocol
             provider_type: remote::model-context-protocol
           vector_io:
           - config:
               kvstore:
                 db_path: /home/nvidia/.llama/distributions/starter/faiss_store.db
                 type: sqlite
             provider_id: faiss
             provider_type: inline::faiss
           - config:
               db_path: /home/nvidia/.llama/distributions/starter/sqlite_vec.db
               kvstore:
                 db_path: /home/nvidia/.llama/distributions/starter/sqlite_vec_registry.db
                 type: sqlite
             provider_id: sqlite-vec
             provider_type: inline::sqlite-vec
         scoring_fns: []
         server:
           port: 8321
         shields: []
         tool_groups:
         - provider_id: tavily-search
           toolgroup_id: builtin::websearch
         - provider_id: rag-runtime
           toolgroup_id: builtin::rag
         vector_dbs: []
         version: 2

INFO     2025-10-06 23:25:40,940 llama_stack.providers.remote.inference.nvidia.nvidia:82 inference::nvidia: Initializing
         NVIDIAInferenceAdapter(http://localhost:8912)...
INFO     2025-10-06 23:25:42,658 llama_stack.providers.utils.inference.inference_store:74 inference_store: Write queue disabled for SQLite to avoid
         concurrency issues
WARNING  2025-10-06 23:25:43,214 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider fireworks: Pass
         Fireworks API Key in the header X-LlamaStack-Provider-Data as { "fireworks_api_key": <your api key>}
WARNING  2025-10-06 23:25:43,216 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider together: Pass
         Together API Key in the header X-LlamaStack-Provider-Data as { "together_api_key": <your api key>}
Handling connection for 8912
ERROR    2025-10-06 23:25:43,858 __main__:527 core::server: Error creating app: 'NVIDIAInferenceAdapter' object has no attribute
         'alias_to_provider_id_map'
++ error_handler 119
++ echo 'Error occurred in script at line: 119'
Error occurred in script at line: 119
++ exit 1
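
(Aside: the fireworks/together model-refresh warnings above are independent of the crash; per the log text, per-request provider keys can be sent in the X-LlamaStack-Provider-Data header. A minimal sketch with requests, assuming the server were up on port 8321 from the repro command and that /v1/models is a valid route:)

# Sketch only: pass a per-request Fireworks key via the
# X-LlamaStack-Provider-Data header described in the warnings above.
# The endpoint path and port are assumptions based on the log output.
import json

import requests

resp = requests.get(
    "http://localhost:8321/v1/models",
    headers={
        "X-LlamaStack-Provider-Data": json.dumps(
            {"fireworks_api_key": "<your api key>"}
        )
    },
)
print(resp.status_code, resp.json())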

Expected behavior

The llama stack server should work with local NIM deployments, e.g. NVIDIA_API_KEY=dummy NVIDIA_BASE_URL=http://localhost:8912
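
Once fixed, a sanity check like the following sketch (using the llama_stack_client Python package; the base_url assumes the port 8321 from the repro command) should list models discovered from the local NIM:

# Sketch: confirm the server started cleanly against the local NIM.
# Assumes the server from the repro command is listening on port 8321.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# With the nvidia provider healthy, this should include models
# discovered from the NIM at http://localhost:8912.
for model in client.models.list():
    print(model.identifier)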

Metadata

Labels

bug (Something isn't working)
