Common types for text-to-speech #1496

@habuma

Similar to what I suggested in #1478, it would be great if text-to-speech had a set of common types. The SpeechModel interface, along with SpeechPrompt, SpeechResponse, and StreamingSpeechModel, looks like what I'd expect from such types, but they are currently delivered in the OpenAI module. Even though OpenAI is the only implementation so far, it feels like those types should be in core, with the implementations and OpenAI-specific extensions in the OpenAI module.

Moreover, while SpeechPrompt feels like it should be in core, it carries OpenAiAudioSpeechOptions. Perhaps there should be a more generic SpeechOptions that is carried by SpeechPrompt, with OpenAiAudioSpeechOptions being an extension of SpeechOptions.
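To make the shape of that suggestion concrete, here is a rough, self-contained sketch. It is not Spring AI's actual code: the accessor names (`getVoice`, `getSpeed`, `getResponseFormat`) and the choice of which settings count as "generic" are assumptions made purely for illustration. Only the type names SpeechPrompt, SpeechOptions, and OpenAiAudioSpeechOptions come from the description above.

```java
// Hypothetical sketch only: these declarations are NOT copied from Spring AI.
// Field and accessor names are illustrative assumptions.

class SpeechTypesSketch {
    public static void main(String[] args) {
        SpeechOptions options = new OpenAiAudioSpeechOptions("alloy", 1.0, "mp3");
        SpeechPrompt prompt = new SpeechPrompt("Hello, world", options);

        // Core code only sees the generic contract.
        System.out.println(prompt.getText() + " / voice=" + prompt.getOptions().getVoice());

        // Provider code can narrow the type when it needs provider-specific settings.
        if (prompt.getOptions() instanceof OpenAiAudioSpeechOptions) {
            OpenAiAudioSpeechOptions openAi = (OpenAiAudioSpeechOptions) prompt.getOptions();
            System.out.println("responseFormat=" + openAi.getResponseFormat());
        }
    }
}

/** Would live in core: settings assumed to be common across text-to-speech providers. */
interface SpeechOptions {
    String getVoice();
    double getSpeed();
}

/** Would live in core: carries the generic SpeechOptions rather than a provider type. */
final class SpeechPrompt {
    private final String text;
    private final SpeechOptions options;

    SpeechPrompt(String text, SpeechOptions options) {
        this.text = text;
        this.options = options;
    }

    String getText() { return text; }
    SpeechOptions getOptions() { return options; }
}

/** Would stay in the OpenAI module: extends the generic contract with extra settings. */
final class OpenAiAudioSpeechOptions implements SpeechOptions {
    private final String voice;
    private final double speed;
    private final String responseFormat; // assumed provider-specific setting

    OpenAiAudioSpeechOptions(String voice, double speed, String responseFormat) {
        this.voice = voice;
        this.speed = speed;
        this.responseFormat = responseFormat;
    }

    @Override public String getVoice() { return voice; }
    @Override public double getSpeed() { return speed; }
    String getResponseFormat() { return responseFormat; }
}
```

With a split like this, code written against SpeechPrompt never needs to import anything from the OpenAI module, and a second provider would only have to supply its own SpeechOptions implementation.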

Altogether, this would not only make these types consistent with how the types for chat and other models are structured, but would also set the stage for additional text-to-speech implementations, should Spring AI add support for more APIs that offer it.

Labels: enhancement, help wanted