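Content is sourced from spring.io and translated by springdoc.tech; copyright belongs to SPRING.IO (Broadcom Inc.). It may be used for personal study and research. Reproduction or commercial use without permission is prohibited.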

Transcription API

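Spring AI provides a unified API for speech-to-text transcription through the TranscriptionModel interface. This lets you write portable code that works across different transcription providers.

Supported Providers

Common Interface

All transcription providers implement the following shared interface:

TranscriptionModel

The TranscriptionModel interface provides methods for converting audio to text: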

public interface TranscriptionModel extends Model<AudioTranscriptionPrompt, AudioTranscriptionResponse> {

    /**
     * Transcribes the audio from the given prompt.
     */
    AudioTranscriptionResponse call(AudioTranscriptionPrompt transcriptionPrompt);

    /**
     * A convenience method for transcribing an audio resource.
     */
    default String transcribe(Resource resource) {
        AudioTranscriptionPrompt prompt = new AudioTranscriptionPrompt(resource);
        return this.call(prompt).getResult().getOutput();
    }

    /**
     * A convenience method for transcribing an audio resource with options.
     */
    default String transcribe(Resource resource, AudioTranscriptionOptions options) {
        AudioTranscriptionPrompt prompt = new AudioTranscriptionPrompt(resource, options);
        return this.call(prompt).getResult().getOutput();
    }
}
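
Because both transcribe(...) overloads are default methods that delegate to call(...), an implementation only has to supply call(...). The sketch below illustrates that delegation with a canned fake of the kind that can be handy in unit tests. It is a simplified illustration, not the Spring AI classes: Prompt, Model, and FakeModel are hypothetical stand-ins, the audio is represented by a plain path string instead of a Resource, and call(...) returns the text directly instead of an AudioTranscriptionResponse.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for the Spring AI types; these are NOT the real classes. */
class DefaultMethodSketch {

    /** Stand-in for AudioTranscriptionPrompt: here it only carries the audio location. */
    static final class Prompt {
        final String audioPath;

        Prompt(String audioPath) {
            this.audioPath = audioPath;
        }
    }

    /** Stand-in for TranscriptionModel: one abstract method plus a default convenience method. */
    interface Model {
        // The only method an implementation must supply.
        String call(Prompt prompt);

        // Convenience method: builds the prompt and delegates to call(...).
        default String transcribe(String audioPath) {
            return call(new Prompt(audioPath));
        }
    }

    /** A canned fake: records the requested paths and returns fixed text. */
    static final class FakeModel implements Model {
        final List<String> seenPaths = new ArrayList<>();

        @Override
        public String call(Prompt prompt) {
            seenPaths.add(prompt.audioPath);
            return "hello world";
        }
    }

    public static void main(String[] args) {
        FakeModel model = new FakeModel();
        // transcribe(...) is inherited; only call(...) was implemented above.
        System.out.println(model.transcribe("/path/to/audio.mp3")); // prints: hello world
        System.out.println(model.seenPaths);                        // prints: [/path/to/audio.mp3]
    }
}
```

In a real project the same idea would apply to the actual interface: a test double would implement call(AudioTranscriptionPrompt) and inherit both transcribe(...) overloads.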

AudioTranscriptionPrompt

The AudioTranscriptionPrompt class encapsulates the input audio and options:

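// 'options' is an AudioTranscriptionOptions instance created elsewhere
// (for example, a provider-specific options object).
Resource audioFile = new FileSystemResource("/path/to/audio.mp3");
AudioTranscriptionPrompt prompt = new AudioTranscriptionPrompt(
    audioFile,
    options
);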

AudioTranscriptionResponse

The AudioTranscriptionResponse class contains the transcribed text and metadata:

AudioTranscriptionResponse response = model.call(prompt);
String transcribedText = response.getResult().getOutput();
AudioTranscriptionResponseMetadata metadata = response.getMetadata();

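Writing Provider-Agnostic Code

A key advantage of the shared transcription interface is that you can write code that works with any transcription provider without modification. The actual provider (OpenAI, Azure OpenAI, etc.) is determined by your Spring Boot configuration, so you can switch providers without changing application code.

Basic Service Example

The shared interface lets you write code that works with any transcription provider: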

@Service
public class TranscriptionService {

    private final TranscriptionModel transcriptionModel;

    public TranscriptionService(TranscriptionModel transcriptionModel) {
        this.transcriptionModel = transcriptionModel;
    }

    public String transcribeAudio(Resource audioFile) {
        return transcriptionModel.transcribe(audioFile);
    }

    public String transcribeWithOptions(Resource audioFile, AudioTranscriptionOptions options) {
        AudioTranscriptionPrompt prompt = new AudioTranscriptionPrompt(audioFile, options);
        AudioTranscriptionResponse response = transcriptionModel.call(prompt);
        return response.getResult().getOutput();
    }
}

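This service works seamlessly with OpenAI, Azure OpenAI, or any other transcription provider; the actual implementation is determined by your Spring Boot configuration.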
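
To make the swap concrete, here is a minimal, dependency-free sketch of the same pattern as the TranscriptionService above. It rests on assumptions: Model and Service are hypothetical stand-ins rather than Spring AI types, the two "providers" are lambdas that only tag their output, and the audio is a plain path string. In a Spring Boot application the container would inject whichever TranscriptionModel bean the configuration produces; here the constructor argument plays that role.

```java
/** Dependency-free illustration of provider swapping; stand-in types only. */
class ProviderSwapSketch {

    /** Stand-in for TranscriptionModel, reduced to a single method. */
    interface Model {
        String transcribe(String audioPath);
    }

    /** Mirrors the shape of TranscriptionService: it depends only on the interface. */
    static final class Service {
        private final Model model;

        Service(Model model) {
            this.model = model;
        }

        String transcribeAudio(String audioPath) {
            return model.transcribe(audioPath);
        }
    }

    public static void main(String[] args) {
        // Two hypothetical providers. Neither performs real transcription.
        Model providerA = path -> "[provider A] " + path;
        Model providerB = path -> "[provider B] " + path;

        // The Service class is identical in both cases; only the injected model differs.
        System.out.println(new Service(providerA).transcribeAudio("meeting.mp3")); // prints: [provider A] meeting.mp3
        System.out.println(new Service(providerB).transcribeAudio("meeting.mp3")); // prints: [provider B] meeting.mp3
    }
}
```

The design point is that Service never names a concrete provider, so replacing one implementation with another requires no change to its code.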

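Provider-Specific Features

While the shared interface provides portability, each provider also offers specific features through provider-specific options classes (for example, OpenAiAudioTranscriptionOptions and AzureOpenAiAudioTranscriptionOptions). These classes implement the AudioTranscriptionOptions interface while adding provider-specific capabilities.

For details on provider-specific features, see each provider's documentation page.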
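
The relationship between the portable options interface and a provider-specific options class can be sketched as follows. Everything here is illustrative: Options stands in for AudioTranscriptionOptions, AcmeOptions is a made-up provider class, and its language field is a hypothetical extra setting, not a documented property of any real provider. The sketch shows one way a provider implementation might read extra settings when they are present and fall back to defaults when it only receives the portable interface.

```java
/** Illustrates portable vs. provider-specific options; stand-in types only. */
class OptionsSketch {

    /** Stand-in for the portable options interface. */
    interface Options {
        String getModel();
    }

    /** Hypothetical provider-specific options: adds a setting the portable interface lacks. */
    static final class AcmeOptions implements Options {
        final String model;
        final String language; // hypothetical provider-only setting

        AcmeOptions(String model, String language) {
            this.model = model;
            this.language = language;
        }

        @Override
        public String getModel() {
            return model;
        }
    }

    /** Uses the provider-specific setting when available, otherwise a default. */
    static String describeRequest(Options options) {
        if (options instanceof AcmeOptions) {
            AcmeOptions acme = (AcmeOptions) options;
            return options.getModel() + "/" + acme.language;
        }
        return options.getModel() + "/default";
    }

    public static void main(String[] args) {
        // Provider-specific options carry the extra setting.
        System.out.println(describeRequest(new AcmeOptions("model-x", "en"))); // prints: model-x/en
        // Portable options expose only what the shared interface defines.
        System.out.println(describeRequest(() -> "model-x"));                  // prints: model-x/default
    }
}
```

Application code that accepts the portable interface type stays provider-neutral; only the place where the options object is constructed needs to know about the provider.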

Section Summary

Azure OpenAI

OpenAI