System Info
I'm testing deployment on an HF endpoint (specifically a single L4 machine on AWS).
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
Deploying jinaai/jina-embeddings-v2-small-en on an HF endpoint with TEI works fine.
Loading it with SentenceTransformers, pushing it back to the Hub, then deploying that copy on an HF endpoint with TEI doesn't work:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "jinaai/jina-embeddings-v2-small-en", trust_remote_code=True
)
model.push_to_hub("borgcollectivegmbh/testing-jina-stuff")
```
Deploying the model I pushed fails with:
```
[Server message] Endpoint failed to start
Exit code: 1. Reason: {"timestamp":"2025-04-03T23:18:41.962397Z","level":"INFO","message":"Args { model_id: \"/rep****ory\", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: \"r-borgcollectivegmbh-testing-jina-stuff-cvy-q6pyonpo-18c10-h6i8\", port: 80, uds_path: \"/tmp/text-embeddings-inference-server\", huggingface_hub_cache: Some(\"/repository/cache\"), payload_limit: 2000000, api_key: None, json_output: true, otlp_endpoint: None, otlp_service_name: \"text-embeddings-inference.server\", cors_allow_origin: None }","target":"text_embeddings_router","filename":"router/src/main.rs","line_number":175}
{"timestamp":"2025-04-03T23:18:41.971108Z","level":"INFO","message":"Maximum number of tokens per request: 8192","target":"text_embeddings_router","filename":"router/src/lib.rs","line_number":188}
{"timestamp":"2025-04-03T23:18:41.971307Z","level":"INFO","message":"Starting 7 tokenization workers","target":"text_embeddings_core::tokenization","filename":"core/src/tokenization.rs","line_number":28}
{"timestamp":"2025-04-03T23:18:41.997508Z","level":"INFO","message":"Starting model backend","target":"text_embeddings_router","filename":"router/src/lib.rs","line_number":230}
{"timestamp":"2025-04-03T23:18:42.397739Z","level":"INFO","message":"Starting FlashBert model on Cuda(CudaDevice(DeviceId(1)))","target":"text_embeddings_backend_candle","filename":"backends/candle/src/lib.rs","line_number":258}
{"timestamp":"2025-04-03T23:18:42.398017Z","level":"ERROR","message":"Could not start Candle backend: Could not start backend: FlashBert only supports absolute position embeddings","target":"text_embeddings_backend","filename":"backends/src/lib.rs","line_number":255}
Error: Could not create backend

Caused by:
    Could not start backend: Could not start a suitable backend
```
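The ERROR line says TEI selected the FlashBert backend, which "only supports absolute position embeddings", so a plausible suspect is that push_to_hub changed backend-relevant keys in config.json between the original repo and my copy. A minimal sketch for diffing those keys (the key list and the example config values below are hypothetical, for illustration; in practice you would download each repo's config.json, e.g. with huggingface_hub, and json.load it):

```python
# Sketch: compare backend-relevant config keys between two model configs.
# The example dicts below are hypothetical placeholders, not the real
# configs of either repository.

KEYS = ("model_type", "position_embedding_type", "auto_map")


def diff_config(original: dict, pushed: dict, keys=KEYS) -> dict:
    """Return {key: (original_value, pushed_value)} for keys that differ."""
    return {
        k: (original.get(k), pushed.get(k))
        for k in keys
        if original.get(k) != pushed.get(k)
    }


# Hypothetical example values, for illustration only:
original_cfg = {"model_type": "bert", "position_embedding_type": "alibi"}
pushed_cfg = {"model_type": "bert", "position_embedding_type": "absolute"}

print(diff_config(original_cfg, pushed_cfg))
# -> {'position_embedding_type': ('alibi', 'absolute')}
```

If the diff is non-empty for a key like position_embedding_type, that would explain why TEI routes the pushed copy to a backend the original never hits.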
You can test this yourself; the model I pushed above is public.
Expected behavior
That deployment works with jinaai/jina-embeddings-v2-small-en even after fine-tuning.