Skip to content

Conversation

rpanackal
Copy link
Member

@rpanackal rpanackal commented Aug 29, 2025

Context

AI/ai-sdk-java-backlog#313.

We would like to introduce convenience for embedding in orchestration module.

Feature scope:

  • New convenience classes embedding model, request and response convenience
    • OrchestrationEmbeddingModel, OrchestrationEmbeddingRequest and OrchestrationEmbeddingResponse
    • Embedding models added as per SAP Notes. All deprecated models skipped.
    • Scenario test call returns inconsistent report on models compared to SAP Notes (ignored)
  • Add additional client method OrchestrationClient#embed(OrchestrationEmbeddingRequest)
  • E2E Embedding endpoint with masking configured
  • Support for float encoding only.
  • Return embedding as List<float[]>. API parity with OpenAI
    • Optional: Generate model class with embedding as type List<float[]>. Setting generator config <useFloatArrays>true</useFloatArrays> seems ineffective. Needs investigation.

Definition of Done

- client method for high level objects
- Only support 'float' encoding format
- Add javadoc
- create convenience embedding response class
- Add unit/integration tests
- Add json payloads for req and res
- Add e2e
@rpanackal rpanackal self-assigned this Aug 29, 2025
@rpanackal rpanackal added the please-review Request to review a pull-request label Sep 1, 2025
Comment on lines +38 to +42
final var bigDecimals = (Embedding.InnerBigDecimals) container.getEmbedding();
final var values = bigDecimals.values();
final float[] arr = new float[values.size()];
for (int i = 0; i < values.size(); i++) {
arr[i] = values.get(i).floatValue();
Copy link
Member Author

@rpanackal rpanackal Sep 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unfortunate consequence of the openapi generator flag <useFloatArrays>true</useFloatArrays> not being supported for

oneOf:                        # works without the `oneOf`
  - type: array
     items:
       type: integer

Copy link
Contributor

@newtork newtork Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is concerning:

See memory footprint comparison

Memory Usage Breakdown

float[] array:

  • Each float uses exactly 4 bytes
  • Array overhead: ~12-16 bytes (object header)
  • Total for 1000 elements: ~4,012 bytes

List<BigDecimal> (assuming ArrayList):

  • ArrayList overhead: ~24 bytes + internal array overhead
  • Each BigDecimal object: ~40-48 bytes (object header + BigInteger + scale + precision)
  • Each BigInteger inside: ~24-32 bytes + int array for digits
  • Boxing overhead from list storage
  • Total for 1000 elements: ~80,000-100,000 bytes

  • Is this limitation originating from the openapi generator or our creators feature?
  • Do you think it's possible to fix it in our creators feature?
  • Is a follow-up BLI already considered?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • openapi generator donot support this at all
  • our creator feature doesn't account for the combined use of USE_FLOAT_ARRAY feature.

PR on the way for fixing our creator feature. 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenAPI Generator PR: SAP/cloud-sdk-java#927

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the above PR + Cloud SDK release a requirement? No right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a requirement. Just a nice to have.

Comment on lines 39 to 53
/** Azure OpenAI Text Embedding 3 Small model */
public static final OrchestrationEmbeddingModel TEXT_EMBEDDING_3_SMALL =
new OrchestrationEmbeddingModel("text-embedding-3-small");

/** Azure OpenAI Text Embedding 3 Large model */
public static final OrchestrationEmbeddingModel TEXT_EMBEDDING_3_LARGE =
new OrchestrationEmbeddingModel("text-embedding-3-large");

/** Amazon Titan Embed Text model */
public static final OrchestrationEmbeddingModel AMAZON_TITAN_EMBED_TEXT =
new OrchestrationEmbeddingModel("amazon.titan-embed-text");

/** NVIDIA LLaMA 3.2 7B NV EmbedQA model */
public static final OrchestrationEmbeddingModel NVIDIA_LLAMA_32_NV_EMBEDQA_1B =
new OrchestrationEmbeddingModel("nvidia--llama-3.2-nv-embedqa-1b");
Copy link
Contributor

@newtork newtork Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Major/Question)

You are introducing another set of model constants we need to maintain. Can't we use existing class(es)?
What is the process of keeping this up-to-date?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My motivation of introducing new class was that embedding models do not share the same parameters as models in OrchestrationAiModel and are not valid for the chatCompletion calls. So, I wanted to avoid a user passing the wrong model to some endpoint.

And I am open to removing this class and extending OrchestrationAiModel. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About maintenance, ideally, we can use an e2e (scenario) test to maintain this. But, unfortunately, this is pointless right now because SAP Notes and the api response are inconsistent.

We need the test for maintenance either way. But I would rather create a follow up BLI because this requires sync with orchestration.


/** Builder step for specifying text inputs to embed. */
@FunctionalInterface
public interface InputStep {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Minor)

I'm not sure whether we used "Step" as "Builder" substitute already somewhere in the project. If not, please reconsider.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have in fact used it once before in TemplateConfig. I actually like the current approach in how clean it is in usage and enforces both required arguments. But again, if you have a strong opinion, I will choose a static factory. Please confirm.

rpanackal and others added 8 commits September 4, 2025 14:29
# Conflicts:
#	docs/release_notes.md
#	orchestration/src/test/java/com/sap/ai/sdk/orchestration/OrchestrationUnitTest.java
# Conflicts:
#	docs/release_notes.md
#	sample-code/spring-app/src/test/java/com/sap/ai/sdk/app/controllers/OrchestrationTest.java
newtork
newtork previously approved these changes Sep 22, 2025
Copy link
Contributor

@newtork newtork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

# Conflicts:
#	sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java
Copy link
Member

@Jonas-Isr Jonas-Isr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Jonas-Isr Jonas-Isr merged commit 4cb0d53 into main Oct 2, 2025
7 checks passed
@Jonas-Isr Jonas-Isr deleted the feat/orchestration/embedding-conv branch October 2, 2025 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
please-review Request to review a pull-request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants