fix: fireworks provider chat completion failing #3422
Conversation
This needs to also revert https://github.com/llamastack/llama-stack/pull/3392/files
@slekkala1 we're moving all the adapters we can to the OpenAIMixin; will you convert the fireworks adapter to use OpenAIMixin and fix usage for everyone?
@mattf Yes, I can do that; let me look into that instead. Please link any related tasks you have. I see another similar bug was reported in #3420.
#3387 has a list of other provider updates. I expect Fireworks can end up looking like https://github.com/llamastack/llama-stack/blob/main/llama_stack/providers/remote/inference/gemini/gemini.py. However, don't bother updating the embedding/completion/chat_completion methods; they're on their way out (#2365).
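For reference, a minimal sketch of what a Fireworks adapter following that pattern might look like, assuming OpenAIMixin exposes `get_api_key`/`get_base_url` hooks as the linked gemini.py adapter suggests (the import path, config attribute, and base URL below are assumptions, not verified against the current codebase):

```python
# Hypothetical sketch of a Fireworks adapter built on OpenAIMixin,
# mirroring the gemini.py adapter linked above. The hook names and
# import path are assumptions based on that file, not confirmed API.
from llama_stack.providers.utils.inference.openai_mixin import OpenAIMixin


class FireworksInferenceAdapter(OpenAIMixin):
    """Serve inference through Fireworks' OpenAI-compatible endpoint."""

    def get_api_key(self) -> str:
        # assumed config attribute holding the Fireworks API key
        return self.config.api_key.get_secret_value()

    def get_base_url(self) -> str:
        # Fireworks' OpenAI-compatible base URL
        return "https://api.fireworks.ai/inference/v1"
```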
this may help for testing -
Thanks for this. Yes, I'm looking into some test failures for fireworks; I'm running into rate-limit-exceeded errors. I'm thinking of having a separate diff once I fix some of these tests, and will close this diff soon.
What does this PR do?
A fix was attempted earlier; since the fireworks API already returns usage in its response, propagating that field through fixes the end-to-end response so it includes usage (see the sketch below).
Context in #3392
Closes #3391
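To illustrate the idea (this is a hedged sketch of the propagation, not the actual PR diff; the function name and dict shape are illustrative assumptions):

```python
# Illustrative sketch: forward the usage block from the Fireworks
# response instead of dropping it when building the completion
# object returned to the caller.
def convert_fireworks_response(fw_response: dict) -> dict:
    out = {
        "id": fw_response["id"],
        "choices": fw_response["choices"],
        "model": fw_response["model"],
    }
    # Fireworks already reports token usage; propagate it end to end.
    if usage := fw_response.get("usage"):
        out["usage"] = usage
    return out
```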
Test Plan
Build and run the server, then test with curl; an equivalent check via the OpenAI Python client is sketched below.
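A minimal sketch of that check, assuming the server's default local port (8321), its OpenAI-compatible route, and a hypothetical Fireworks model id (all three are assumptions about the local setup):

```python
# Hedged end-to-end check: after the fix, the response's usage field
# should be populated instead of None. Port, route, and model id are
# assumptions about the local server, not verified values.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")
resp = client.chat.completions.create(
    model="fireworks/llama-v3p1-8b-instruct",  # hypothetical model id
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.usage)  # expect populated token counts, not None
```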