[Feature] Dynamic Model Loading and Model Endpoint in FastAPI

I'd like to propose a future feature I think would add useful flexibility for users of the `completions/embeddings` API . I'm suggesting the ability to dynamically load models based on calls to the `FastAPI` endpoint.

The concept is as follows:
- Have a predefined location for model files (e.g., a `models` folder within the project) and allow users to specify an additional model folder if needed.
- When the API starts, it checks the designated model folders and populates the available models dynamically.
- Users can query the available models through a `GET` request to the `/v1/engines` endpoint , which would return a list of models and their statuses.
- Users can then specify the desired model when making inference requests.

This dynamic model loading feature would align with the behavior of the OpenAI spec for models and model status. It would offer users the flexibility to easily choose and use different models without having make manual changes to the project or configs.

This is a suggestion for later, but I wanted to suggest it now so we can plan if we do decide to implement it.

Let me know your thoughts :)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Feature] Dynamic Model Loading and Model Endpoint in FastAPI #17

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Participants

[Feature] Dynamic Model Loading and Model Endpoint in FastAPI #17

Description

Activity

0xdevalias commented on Apr 11, 2023

jmtatsch commented on Apr 12, 2023

MillionthOdin16 commented on Apr 13, 2023

abetlen commented on Apr 13, 2023

docmeth02 commented on Apr 13, 2023

abetlen commented on Dec 22, 2023

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Participants

Issue actions