-
Notifications
You must be signed in to change notification settings - Fork 15
feat: [Orchestration] Embedding Convenience #562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- client method for high level objects - Only support 'float' encoding format - Add javadoc - create convenience embedding response class - Add unit/integration tests - Add json payloads for req and res - Add e2e
final var bigDecimals = (Embedding.InnerBigDecimals) container.getEmbedding(); | ||
final var values = bigDecimals.values(); | ||
final float[] arr = new float[values.size()]; | ||
for (int i = 0; i < values.size(); i++) { | ||
arr[i] = values.get(i).floatValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an unfortunate consequence of the openapi generator flag <useFloatArrays>true</useFloatArrays>
not being supported for
oneOf: # works without the `oneOf`
- type: array
items:
type: integer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is concerning:
See memory footprint comparison
Memory Usage Breakdown
float[]
array:
- Each
float
uses exactly 4 bytes - Array overhead: ~12-16 bytes (object header)
- Total for 1000 elements: ~4,012 bytes
List<BigDecimal>
(assuming ArrayList):
ArrayList
overhead: ~24 bytes + internal array overhead- Each
BigDecimal
object: ~40-48 bytes (object header +BigInteger
+ scale + precision) - Each
BigInteger
inside: ~24-32 bytes + int array for digits - Boxing overhead from list storage
- Total for 1000 elements: ~80,000-100,000 bytes
- Is this limitation originating from the openapi generator or our creators feature?
- Do you think it's possible to fix it in our creators feature?
- Is a follow-up BLI already considered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- openapi generator donot support this at all
- our creator feature doesn't account for the combined use of USE_FLOAT_ARRAY feature.
PR on the way for fixing our creator feature. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpenAPI Generator PR: SAP/cloud-sdk-java#927
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the above PR + Cloud SDK release a requirement? No right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a requirement. Just a nice to have.
orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationEmbeddingResponse.java
Show resolved
Hide resolved
/** Azure OpenAI Text Embedding 3 Small model */ | ||
public static final OrchestrationEmbeddingModel TEXT_EMBEDDING_3_SMALL = | ||
new OrchestrationEmbeddingModel("text-embedding-3-small"); | ||
|
||
/** Azure OpenAI Text Embedding 3 Large model */ | ||
public static final OrchestrationEmbeddingModel TEXT_EMBEDDING_3_LARGE = | ||
new OrchestrationEmbeddingModel("text-embedding-3-large"); | ||
|
||
/** Amazon Titan Embed Text model */ | ||
public static final OrchestrationEmbeddingModel AMAZON_TITAN_EMBED_TEXT = | ||
new OrchestrationEmbeddingModel("amazon.titan-embed-text"); | ||
|
||
/** NVIDIA LLaMA 3.2 7B NV EmbedQA model */ | ||
public static final OrchestrationEmbeddingModel NVIDIA_LLAMA_32_NV_EMBEDQA_1B = | ||
new OrchestrationEmbeddingModel("nvidia--llama-3.2-nv-embedqa-1b"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Major/Question)
You are introducing another set of model constants we need to maintain. Can't we use existing class(es)?
What is the process of keeping this up-to-date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My motivation of introducing new class was that embedding models do not share the same parameters as models in OrchestrationAiModel
and are not valid for the chatCompletion
calls. So, I wanted to avoid a user passing the wrong model to some endpoint.
And I am open to removing this class and extending OrchestrationAiModel
. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About maintenance, ideally, we can use an e2e (scenario) test to maintain this. But, unfortunately, this is pointless right now because SAP Notes and the api response are inconsistent.
We need the test for maintenance either way. But I would rather create a follow up BLI because this requires sync with orchestration.
orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationEmbeddingRequest.java
Show resolved
Hide resolved
|
||
/** Builder step for specifying text inputs to embed. */ | ||
@FunctionalInterface | ||
public interface InputStep { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Minor)
I'm not sure whether we used "Step" as "Builder" substitute already somewhere in the project. If not, please reconsider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have in fact used it once before in TemplateConfig
. I actually like the current approach in how clean it is in usage and enforces both required arguments. But again, if you have a strong opinion, I will choose a static factory. Please confirm.
orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationEmbeddingRequest.java
Outdated
Show resolved
Hide resolved
orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationEmbeddingRequest.java
Outdated
Show resolved
Hide resolved
# Conflicts: # docs/release_notes.md # orchestration/src/test/java/com/sap/ai/sdk/orchestration/OrchestrationUnitTest.java
# Conflicts: # docs/release_notes.md # sample-code/spring-app/src/test/java/com/sap/ai/sdk/app/controllers/OrchestrationTest.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
# Conflicts: # sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Context
AI/ai-sdk-java-backlog#313.
We would like to introduce convenience for embedding in orchestration module.
Feature scope:
OrchestrationEmbeddingModel
,OrchestrationEmbeddingRequest
andOrchestrationEmbeddingResponse
OrchestrationClient#embed(OrchestrationEmbeddingRequest)
List<float[]>
. API parity with OpenAIList<float[]>
. Setting generator config<useFloatArrays>true</useFloatArrays>
seems ineffective. Needs investigation.Definition of Done
Error handling created / updated & covered by the tests aboveAligned changes with the JavaScript SDK