Limit the available models list to only those explicitly enabled in config #3151

@bparees

Description

🚀 Describe the new functionality needed

Today the /v1/models endpoint lists every model discovered at the provider, plus any models explicitly configured in run.yaml. This makes for an easy "getting started" experience, but it is a poor fit for clients that want the backend infrastructure to tell them which models to use, for example based on corporate policy or other requirements.

I think it would be helpful to have a flag that enables or disables including auto-discovered models in the models list API. I don't have an opinion on whether the default should be on or off, but defaulting to on would preserve existing behavior.

Another option would be to introduce a second API, or a parameter on the existing API, that controls whether you get all models or only the explicitly configured ones.
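To make the two options concrete, here is a minimal sketch, not a proposal for the actual implementation. Every name in it (`ModelEntry`, `ModelSource`, `include_discovered_default`, `include_discovered`) is hypothetical and is not taken from the llamastack codebase. It assumes the server can tell whether a model came from run.yaml or from provider discovery.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ModelSource(Enum):
    CONFIGURED = "configured"  # declared explicitly in run.yaml
    DISCOVERED = "discovered"  # found by querying the provider


@dataclass(frozen=True)
class ModelEntry:
    # Hypothetical record; the real model type may look different.
    identifier: str
    provider_id: str
    source: ModelSource


def list_models(
    registry: List[ModelEntry],
    include_discovered_default: bool = True,
    include_discovered: Optional[bool] = None,
) -> List[ModelEntry]:
    """Return the models that a list call would expose.

    include_discovered_default stands in for a server-side config flag
    (option 1). include_discovered stands in for a per-request parameter
    (option 2); when it is None, the server-side default applies.
    """
    if include_discovered is None:
        show_discovered = include_discovered_default
    else:
        show_discovered = include_discovered
    return [
        m for m in registry
        if show_discovered or m.source is ModelSource.CONFIGURED
    ]
```

With the default left on, the call returns everything, which matches the current behavior. In this sketch a per-request parameter can widen the list beyond the server setting; whether that should be allowed at all is a policy question this issue leaves open.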

💡 Why is this needed? What if we don't build it?

Without this feature, any client that talks to a llamastack instance needs its own config and filtering logic to ensure it uses only the intended models and not others that happen to be discovered at a provider. In some cases that may be sensible, but in others, requiring every client implementation to reimplement that logic (and to keep those configurations consistent across client deployments) is more effort than letting an admin configure the llamastack instance with the models that should be available.
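For illustration, this is roughly the kind of filtering each client has to carry today. It is a sketch under assumptions: the function name, the allowlist, and the model identifiers are all made up, and it works on a plain list of identifiers rather than on the actual /v1/models response.

```python
from typing import Iterable, List


def filter_allowed(available_ids: Iterable[str], allowed_ids: Iterable[str]) -> List[str]:
    """Keep only the models that this client's own allowlist permits.

    available_ids: identifiers the server reported (configured + discovered).
    allowed_ids:   identifiers this client has been configured to use.
    Order of available_ids is preserved.
    """
    allowed = set(allowed_ids)
    return [model_id for model_id in available_ids if model_id in allowed]


# Hypothetical identifiers, for illustration only.
reported = ["provider-x/model-a", "provider-x/model-b", "provider-x/model-c"]
client_allowlist = ["provider-x/model-a", "provider-x/model-c"]
usable = filter_allowed(reported, client_allowlist)
```

Every client deployment needs its own copy of `client_allowlist`, and those copies have to be kept in sync by hand, which is the duplication described above.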

And since provider/model configuration must be done in llamastack anyway (to supply credentials, endpoints, etc.), splitting and duplicating the model/provider config between llamastack and clients feels like a worse experience than controlling it all centrally in the llamastack config.

Other thoughts

No response

Metadata

Labels: enhancement (New feature or request)
