Closed
Description
Hi :)
I noticed in the roadmap that embedding support is planned, and was wondering whether it includes LLMs such as Mistral as well.
Specifically, e5_mistral has the added benefit that its HF repo contains only the adapter, so in this case we could deploy a single pod for both inference and truly SOTA embeddings at no added cost.
I assume it would be easier to implement since decoder-only architectures are already supported.
For e5_mistral, I think the tweak would be to add a function to LLMEngine that returns the last hidden state rather than sampling from the output, yes? If so, I could try to add the PR myself.
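To illustrate the idea, here is a minimal sketch (names and shapes are my own assumptions, not vLLM's API) of the pooling step: given the decoder's last hidden states, take the hidden vector at each sequence's final non-padding token and L2-normalize it, which is the last-token pooling that e5-mistral-style embedding models use instead of sampling.

```python
import torch
import torch.nn.functional as F

def last_token_pool(hidden_states: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
    """Pool decoder hidden states into one embedding per sequence.

    hidden_states:  [batch, seq_len, hidden] -- last layer of the decoder
    attention_mask: [batch, seq_len], 1 for real tokens, 0 for (right) padding
    Assumes right padding; left-padded batches would need a different index.
    """
    # Index of the last non-padding token in each sequence
    last_idx = attention_mask.sum(dim=1) - 1
    batch_idx = torch.arange(hidden_states.size(0))
    embeddings = hidden_states[batch_idx, last_idx]  # [batch, hidden]
    # Unit-normalize so cosine similarity is a plain dot product
    return F.normalize(embeddings, p=2, dim=-1)

# Toy usage with random "hidden states" in place of a real forward pass
hidden = torch.randn(2, 4, 8)
mask = torch.tensor([[1, 1, 1, 0],
                     [1, 1, 1, 1]])
emb = last_token_pool(hidden, mask)  # shape: [2, 8], unit-norm rows
```

The point is that everything upstream (tokenizer, KV cache, decoder forward pass) is unchanged; only the final sampling step is replaced by this pooling.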
Please let me know if there's anything I can do to help.