Common types for text-to-speech

Similar to what I suggested in #1478, it would be great if text-to-speech had a set of common types. The `SpeechModel` interface, as well as `SpeechPrompt`, `SpeechResponse`, and `StreamingSpeechModel`, look like what I'd expect such types to be, but they are currently delivered in the OpenAI module. Even though OpenAI is the only implementation today, those types feel like they belong in core, with the implementations and any OpenAI-specific extensions staying in the OpenAI module.

Moreover, while `SpeechPrompt` feels like it should be in core, it carries `OpenAiAudioSpeechOptions`, which ties it to OpenAI. Perhaps `SpeechPrompt` should carry a more generic `SpeechOptions` instead, with `OpenAiAudioSpeechOptions` being an extension of `SpeechOptions`.
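To make the proposal concrete, here is a rough sketch of how the split could look. It is only an illustration under my own assumptions: the members of each type (`model()`, `voice()`, `speed()`, `responseFormat()`, `audio()`) are hypothetical, the types are simplified stand-ins rather than the current Spring AI classes, and `EchoSpeechModel` is a made-up fake used only so the example runs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

// --- Would live in core (members are assumptions, not the current API) ---

// Provider-neutral options that any text-to-speech implementation could honor.
interface SpeechOptions {
    String model();
    String voice();
    Double speed();
}

// The prompt carries the generic SpeechOptions rather than an OpenAI type.
record SpeechPrompt(String text, SpeechOptions options) {
    SpeechPrompt {
        Objects.requireNonNull(text, "text must not be null");
    }
}

// Simplified response holding the synthesized audio bytes.
record SpeechResponse(byte[] audio) {
}

interface SpeechModel {
    SpeechResponse call(SpeechPrompt prompt);
}

// --- Would stay in the OpenAI module ---

// Simplified stand-in: extends the generic options with a provider-specific field.
record OpenAiAudioSpeechOptions(String model, String voice, Double speed, String responseFormat)
        implements SpeechOptions {
}

// --- Hypothetical fake implementation, for demonstration only ---

// Returns the prompt text as UTF-8 bytes instead of calling a real speech API.
class EchoSpeechModel implements SpeechModel {
    @Override
    public SpeechResponse call(SpeechPrompt prompt) {
        return new SpeechResponse(prompt.text().getBytes(StandardCharsets.UTF_8));
    }
}

class SpeechTypesSketch {
    public static void main(String[] args) {
        // Option values are illustrative.
        SpeechOptions options = new OpenAiAudioSpeechOptions("tts-1", "alloy", 1.0, "mp3");
        SpeechModel model = new EchoSpeechModel();

        // Calling code depends only on the core types.
        SpeechResponse response = model.call(new SpeechPrompt("Hello", options));
        System.out.println(response.audio().length + " bytes of audio");
    }
}
```

With this shape, code written against `SpeechModel` and `SpeechPrompt` never needs to import anything from the OpenAI module; only code that wants OpenAI-specific settings would reference `OpenAiAudioSpeechOptions` directly.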

Altogether, this would not only make the types more consistent with how the types for chat and other models are structured, but would also set the stage for additional text-to-speech implementations, should more APIs that offer text-to-speech be added to Spring AI.


Common types for text-to-speech #1496
