Closed
Description
Hi :)
I noticed in the roadmap that embedding support is planned, and was wondering whether it includes LLMs such as Mistral as well.
Specifically, e5_mistral has the added benefit that its HF repo contains only the adapter, so in this case we could deploy a single pod for both inference and truly SOTA embeddings at no added cost.
I assume it would be easier to implement since decoder-only architectures are already supported.
For e5_mistral, I think the tweak would be to add a function to LLMEngine that returns the last hidden state rather than sampling from the output, yes? If so, I could try to add the PR myself.
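To illustrate the idea, here is a minimal sketch (names and shapes are my own assumptions, not vLLM's API) of the pooling step: given the decoder's last hidden states, take the hidden vector at each sequence's final non-padding token and L2-normalize it, which is the last-token pooling that e5-mistral-style embedding models use instead of sampling.

```python
import torch
import torch.nn.functional as F

def last_token_pool(hidden_states: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
    """Pool decoder hidden states into one embedding per sequence.

    hidden_states:  [batch, seq_len, hidden] -- last layer of the decoder
    attention_mask: [batch, seq_len], 1 for real tokens, 0 for (right) padding
    Assumes right padding; left-padded batches would need a different index.
    """
    # Index of the last non-padding token in each sequence
    last_idx = attention_mask.sum(dim=1) - 1
    batch_idx = torch.arange(hidden_states.size(0))
    embeddings = hidden_states[batch_idx, last_idx]  # [batch, hidden]
    # Unit-normalize so cosine similarity is a plain dot product
    return F.normalize(embeddings, p=2, dim=-1)

# Toy usage with random "hidden states" in place of a real forward pass
hidden = torch.randn(2, 4, 8)
mask = torch.tensor([[1, 1, 1, 0],
                     [1, 1, 1, 1]])
emb = last_token_pool(hidden, mask)  # shape: [2, 8], unit-norm rows
```

The point is that everything upstream (tokenizer, KV cache, decoder forward pass) is unchanged; only the final sampling step is replaced by this pooling.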
Please let me know if there's anything I can do to help.