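Documentation content is sourced from spring.io and translated by springdoc.tech; copyright belongs to SPRING.IO (Broadcom Inc.). It may be used for personal study and research. Reproduction or commercial use without permission is prohibited.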

Google GenAI Chat

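The Google GenAI API lets developers build generative AI applications with Google's Gemini models through either the Gemini Developer API or Vertex AI. It accepts multimodal prompts as input and outputs text or code. A multimodal model can process information from several modalities, including images, video, and text. For example, you can send the model a photo of a plate of cookies and ask it for the recipe.

Gemini is a family of generative AI models developed by Google DeepMind and designed for multimodal use cases. The Gemini API gives you access to Gemini 2.0 Flash, Gemini 2.0 Flash-Lite, and all Gemini Pro models, including the latest Gemini 3 Pro.

This implementation provides two authentication modes:

Gemini Developer API: use an API key for quick prototyping and development
Vertex AI: use Google Cloud credentials for production deployments with enterprise features

Gemini API Reference

Prerequisites

Choose one of the following authentication methods:

Option 1: Gemini Developer API (API Key)

Obtain an API key from Google AI Studio
Set the API key as an environment variable or in your application properties

Option 2: Vertex AI (Google Cloud)

Install the gcloud CLI appropriate for your operating system.
Authenticate by running the following command. Replace PROJECT_ID with your Google Cloud project ID and ACCOUNT with your Google Cloud username.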

gcloud config set project <PROJECT_ID> &&
gcloud auth application-default login <ACCOUNT>

Auto-configuration

Note

The Spring AI auto-configuration and starter module artifact names have changed significantly. See the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the Google GenAI Chat Client. To enable it, add the following dependency to your project's Maven pom.xml or Gradle build.gradle build file:

Maven:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-google-genai</artifactId>
</dependency>

Gradle:

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-google-genai'
}

:::tip
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
:::

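Chat Properties

Note

Enabling and disabling the chat auto-configuration is now done through the top-level property spring.ai.model.chat.

To enable: spring.ai.model.chat=google-genai (enabled by default)

To disable: spring.ai.model.chat=none (or any value that does not match google-genai)

This change was made to allow multiple models to be configured.

Connection Properties

The prefix spring.ai.google.genai is the property prefix that lets you connect to Google GenAI.

Property	Description	Default
spring.ai.model.chat	Enables the chat model client	google-genai
spring.ai.google.genai.api-key	API key for the Gemini Developer API. When provided, the client uses the Gemini Developer API instead of Vertex AI.	-
spring.ai.google.genai.project-id	Google Cloud Platform project ID (required in Vertex AI mode)	-
spring.ai.google.genai.location	Google Cloud region (required in Vertex AI mode)	-
spring.ai.google.genai.credentials-uri	URI of the Google Cloud credentials. When provided, it is used to create a `GoogleCredentials` instance for authentication.	-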
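
The selection rule in the table above (an API key selects the Gemini Developer API; otherwise a project ID and location are needed for Vertex AI) can be sketched in plain Java. This is only an illustration under stated assumptions: `AuthModeResolver`, `AuthMode`, and `resolve` are hypothetical names, and this is not the Spring AI auto-configuration code, which may validate its settings differently.

```java
import java.util.Optional;

public class AuthModeResolver {

    /** The two authentication modes described in this document. */
    public enum AuthMode { GEMINI_DEVELOPER_API, VERTEX_AI }

    /**
     * Hypothetical helper: picks a mode from the three connection settings.
     * Returns empty when neither an API key nor a project-id/location pair is set.
     */
    public static Optional<AuthMode> resolve(String apiKey, String projectId, String location) {
        if (hasText(apiKey)) {
            // An API key takes precedence over any Vertex AI settings.
            return Optional.of(AuthMode.GEMINI_DEVELOPER_API);
        }
        if (hasText(projectId) && hasText(location)) {
            // Vertex AI mode needs both the project ID and the location.
            return Optional.of(AuthMode.VERTEX_AI);
        }
        return Optional.empty();
    }

    private static boolean hasText(String value) {
        return value != null && !value.isBlank();
    }

    public static void main(String[] args) {
        System.out.println(resolve("my-key", null, null));
        System.out.println(resolve(null, "my-project", "us-central1"));
        System.out.println(resolve(null, "my-project", null));
    }
}
```

A check like this can be useful in a startup test to fail fast when neither mode is fully configured.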

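Chat Model Properties

The prefix spring.ai.google.genai.chat is the property prefix for configuring the Google GenAI Chat model implementation.

Property	Description	Default
spring.ai.google.genai.chat.options.model	Supported Google GenAI Chat models include `gemini-2.0-flash`, `gemini-2.0-flash-lite`, `gemini-pro`, and `gemini-1.5-flash`.	gemini-2.0-flash
spring.ai.google.genai.chat.options.response-mime-type	Output response MIME type of the generated candidate text.	`text/plain`: (default) Text output or `application/json`: JSON response.
spring.ai.google.genai.chat.options.google-search-retrieval	Use the Google Search grounding feature.	`true` or `false`, default `false`.
spring.ai.google.genai.chat.options.temperature	Controls the randomness of the output. Values range over [0.0,1.0], inclusive. A value closer to 1.0 produces more varied responses, while a value closer to 0.0 typically produces less surprising responses.	0.7
spring.ai.google.genai.chat.options.top-k	The maximum number of tokens to consider when sampling. The model uses combined top-k and nucleus sampling. Top-k sampling considers the set of the k most probable tokens.	-
spring.ai.google.genai.chat.options.top-p	The maximum cumulative probability of tokens to consider when sampling. The model uses combined top-k and nucleus sampling. Nucleus sampling considers the smallest set of tokens whose probability sum is at least topP.	-
spring.ai.google.genai.chat.options.candidate-count	The number of generated response messages to return. This value must be between 1 and 8, inclusive.	1
spring.ai.google.genai.chat.options.max-output-tokens	The maximum number of tokens to generate.	-
spring.ai.google.genai.chat.options.frequency-penalty	Frequency penalty for reducing repetition.	-
spring.ai.google.genai.chat.options.presence-penalty	Presence penalty for reducing repetition.	-
spring.ai.google.genai.chat.options.thinking-budget	Token budget for the thinking process. See Thinking Configuration.	-
spring.ai.google.genai.chat.options.thinking-level	The level of thinking tokens the model should generate. Valid values: `LOW`, `HIGH`, `THINKING_LEVEL_UNSPECIFIED`. See Thinking Level.	-
spring.ai.google.genai.chat.options.include-thoughts	Enables thought signatures for function calling. Must be enabled for Gemini 3 Pro to avoid validation errors during the internal tool execution loop. See Thought Signatures.	false
spring.ai.google.genai.chat.options.tool-names	List of tools, identified by name, to enable for function calling in a single prompt request. Tools with these names must exist in the ToolCallback registry.	-
spring.ai.google.genai.chat.options.tool-callbacks	Tool callbacks to register with the chat model.	-
spring.ai.google.genai.chat.options.internal-tool-execution-enabled	If true, tool calls are executed internally; otherwise the model's response is returned to the caller. Defaults to null, in which case `ToolCallingChatOptions.DEFAULT_TOOL_EXECUTION_ENABLED` (which is true) applies.	-
spring.ai.google.genai.chat.options.safety-settings	List of safety settings that control the safety filters, as defined by Google GenAI Safety Settings. Each safety setting can have a method, threshold, and category.	-
spring.ai.google.genai.chat.options.cached-content-name	Name of the cached content to use for this request. When set together with `use-cached-content=true`, the cached content is used as context. See Cached Content.	-
spring.ai.google.genai.chat.options.use-cached-content	Whether to use cached content if available. When `true` and `cached-content-name` is set, the cached content is used.	false
spring.ai.google.genai.chat.options.auto-cache-threshold	Automatically cache prompts that exceed this token threshold. When set, prompts larger than this value are cached automatically for reuse. Set to `null` to disable auto-caching.	-
spring.ai.google.genai.chat.options.auto-cache-ttl	Time-to-live (a duration) for auto-cached content, in ISO-8601 format (for example, `PT1H` for 1 hour). Used when auto-caching is enabled.	PT1H
spring.ai.google.genai.chat.enable-cached-content	Enables the `GoogleGenAiCachedContentService` bean for managing cached content.	true

:::tip
All properties prefixed with spring.ai.google.genai.chat.options can be overridden at runtime by adding request-specific runtime options to the Prompt call.
:::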
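
The top-k and top-p rows in the table above describe a combined filter: keep the k most probable tokens, then keep the smallest prefix whose probabilities sum to at least topP. The following is a minimal sketch of that idea in plain Java, not the model's actual sampler. `SamplingFilterSketch`, `Candidate`, and `filter` are hypothetical names, and the probabilities are made-up illustration values.

```java
import java.util.ArrayList;
import java.util.List;

public class SamplingFilterSketch {

    /** A candidate token and its probability. */
    public record Candidate(String token, double probability) { }

    /**
     * Keeps at most topK of the highest-probability candidates and stops early
     * once the cumulative probability of the kept candidates reaches topP.
     */
    public static List<Candidate> filter(List<Candidate> candidates, int topK, double topP) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        // Sort by descending probability.
        sorted.sort((x, y) -> Double.compare(y.probability(), x.probability()));

        List<Candidate> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (Candidate candidate : sorted) {
            if (kept.size() >= topK || cumulative >= topP) {
                break;
            }
            kept.add(candidate);
            cumulative += candidate.probability();
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
                new Candidate("the", 0.5),
                new Candidate("a", 0.25),
                new Candidate("an", 0.125),
                new Candidate("this", 0.125));
        // With topK=3 and topP=0.7, "the" and "a" are kept (0.5 + 0.25 >= 0.7).
        System.out.println(filter(candidates, 3, 0.7));
    }
}
```

The model then samples the next token from the kept candidates only, which is why lower top-k or top-p values give more predictable output.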

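Runtime Options

GoogleGenAiChatOptions.java provides model configuration options such as temperature and topK.

On startup, the default options can be configured with the GoogleGenAiChatModel(client, options) constructor or the spring.ai.google.genai.chat.options.* properties.

At runtime, you can override the defaults by adding new, request-specific options to the Prompt call. For example, to override the default temperature for a specific request:

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        GoogleGenAiChatOptions.builder()
            .temperature(0.4)
            .build()
    ));

Tip

In addition to the model-specific GoogleGenAiChatOptions, you can use a portable ChatOptions instance created with ChatOptions#builder().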

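Tool Calling

Google GenAI models support tool calling (function calling), which lets the model use tools during a conversation. Here is an example of defining and using @Tool-based tools: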

public class WeatherService {

    @Tool(description = "Get the weather in location")
    public String weatherByLocation(@ToolParam(description= "City or state name") String location) {
        ...
    }
}

String response = ChatClient.create(this.chatModel)
        .prompt("What's the weather like in Boston?")
        .tools(new WeatherService())
        .call()
        .content();

You can also use beans of the java.util.function types as tools:

@Bean
@Description("Get the weather in location. Return temperature in 36°F or 36°C format.")
public Function<Request, Response> weatherFunction() {
    return new MockWeatherService();
}

// Reference the Function bean by its bean name.
String response = ChatClient.create(this.chatModel)
        .prompt("What's the weather like in Boston?")
        .toolNames("weatherFunction")
        .call()
        .content();

See the Tools documentation for more information.

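Thinking Configuration

Gemini models support a "thinking" capability that lets the model reason more deeply before generating a response. It is controlled through ThinkingConfig, which has three related options: thinkingBudget, thinkingLevel, and includeThoughts.

Thinking Level

The thinkingLevel option controls the depth of the reasoning tokens the model generates. It is available for models that support thinking (for example, Gemini 3 Pro Preview).

Value	Description
`LOW`	Minimal thinking. Use for simple queries where speed matters more than deep analysis.
`HIGH`	Extensive thinking. Use for complex problems that need deep analysis and step-by-step reasoning.
`THINKING_LEVEL_UNSPECIFIED`	The model uses its default behavior.

Configuration via Properties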

spring.ai.google.genai.chat.options.model=gemini-3-pro-preview
spring.ai.google.genai.chat.options.thinking-level=HIGH

Programmatic Configuration

import org.springframework.ai.google.genai.common.GoogleGenAiThinkingLevel;

ChatResponse response = chatModel.call(
    new Prompt(
        "Explain the theory of relativity in simple terms.",
        GoogleGenAiChatOptions.builder()
            .model("gemini-3-pro-preview")
            .thinkingLevel(GoogleGenAiThinkingLevel.HIGH)
            .build()
    ));

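Thinking Budget

The thinkingBudget option sets a token budget for the thinking process:

Positive value: the maximum number of thinking tokens (for example, 8192)
Zero (0): disables thinking entirely
Not set: the model decides automatically based on query complexity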

ChatResponse response = chatModel.call(
    new Prompt(
        "Solve this complex math problem step by step.",
        GoogleGenAiChatOptions.builder()
            .model("gemini-2.5-pro")
            .thinkingBudget(8192)
            .build()
    ));

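Option Compatibility

:::important
thinkingLevel and thinkingBudget are mutually exclusive. You cannot use both in the same request; doing so results in an API error.

For Gemini 3 Pro models, use thinkingLevel (LOW, HIGH)
For Gemini 2.5 series models, use thinkingBudget (a token count)
:::

You can combine includeThoughts with either thinkingLevel or thinkingBudget (but not with both at once):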

// For Gemini 3 Pro: use thinkingLevel + includeThoughts
ChatResponse response = chatModel.call(
    new Prompt(
        "Analyze this complex scenario.",
        GoogleGenAiChatOptions.builder()
            .model("gemini-3-pro-preview")
            .thinkingLevel(GoogleGenAiThinkingLevel.HIGH)
            .includeThoughts(true)
            .build()
    ));

// For Gemini 2.5: use thinkingBudget + includeThoughts
ChatResponse response = chatModel.call(
    new Prompt(
        "Analyze this complex scenario.",
        GoogleGenAiChatOptions.builder()
            .model("gemini-2.5-pro")
            .thinkingBudget(8192)
            .includeThoughts(true)
            .build()
    ));
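
Because sending both options produces an API error, it can help to reject the combination before a request is built. The sketch below shows one way to do that in plain Java. `ThinkingOptionsCheck` and `validate` are hypothetical names introduced here for illustration; they are not part of Spring AI, and the level is passed as a plain string to keep the example self-contained.

```java
public class ThinkingOptionsCheck {

    /**
     * Hypothetical pre-flight check for the compatibility rule described above:
     * thinkingLevel and thinkingBudget must not both be set on one request.
     * Either argument may be null, meaning "not set".
     */
    public static void validate(String thinkingLevel, Integer thinkingBudget) {
        if (thinkingLevel != null && thinkingBudget != null) {
            throw new IllegalArgumentException(
                    "thinkingLevel and thinkingBudget are mutually exclusive; set only one");
        }
    }

    public static void main(String[] args) {
        validate("HIGH", null);   // accepted: level only
        validate(null, 8192);     // accepted: budget only
        try {
            validate("HIGH", 8192);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```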

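Model Support

The thinking configuration options are model-specific:

Model	thinkingLevel	thinkingBudget	Notes
Gemini 3 Pro (Preview)	✅ Supported	⚠️ Backwards compatibility only	Use `thinkingLevel`. Thinking cannot be disabled. Requires the global endpoint.
Gemini 2.5 Pro	❌ Not supported	✅ Supported	Use `thinkingBudget`. Set to 0 to disable, -1 for dynamic mode.
Gemini 2.5 Flash	❌ Not supported	✅ Supported	Use `thinkingBudget`. Set to 0 to disable, -1 for dynamic mode.
Gemini 2.5 Flash-Lite	❌ Not supported	✅ Supported	Thinking is disabled by default. Set `thinkingBudget` to enable it.
Gemini 2.0 Flash	❌ Not supported	❌ Not supported	Thinking is not available.

:::important

Using thinkingLevel on unsupported models (for example, Gemini 2.5 or earlier) results in an API error.
Gemini 3 Pro Preview is available only on the global endpoint. Set spring.ai.google.genai.location=global or GOOGLE_CLOUD_LOCATION=global.
Check the Google GenAI Thinking documentation for the latest model capabilities.
:::

Note

Enabling thinking increases token usage and API cost. Use it in proportion to the complexity of your queries.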

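Thought Signatures

Gemini 3 Pro introduces thought signatures, opaque byte arrays that preserve the model's reasoning context during function calling. When includeThoughts is enabled, the model returns thought signatures, which must be passed back within the same turn of the internal tool execution loop.

When Thought Signatures Matter

Important: thought signature validation applies only to the current turn, specifically the internal tool execution loop in which the model makes function calls (both parallel and sequential). The API does not validate thought signatures for earlier turns in the conversation history.

According to Google's documentation:

Signature validation is enforced only for function calls in the current turn
Signatures from previous turns do not need to be preserved
A missing signature on a function call in the current turn causes an HTTP 400 error for Gemini 3 Pro
For parallel function calls, only the first functionCall part carries the signature

For Gemini 2.5 Pro and earlier models, thought signatures are optional and the API is lenient about them.

Configuration

Enable thought signatures with configuration properties: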

spring.ai.google.genai.chat.options.model=gemini-3-pro-preview
spring.ai.google.genai.chat.options.include-thoughts=true

Or programmatically at runtime:

ChatResponse response = chatModel.call(
    new Prompt(
        "Your question here",
        GoogleGenAiChatOptions.builder()
            .model("gemini-3-pro-preview")
            .includeThoughts(true)
            .toolCallbacks(callbacks)
            .build()
    ));

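Automatic Handling

Spring AI handles thought signatures automatically within the internal tool execution loop. When internalToolExecutionEnabled is true (the default), Spring AI:

Extracts thought signatures from the model response
Attaches them to the correct functionCall parts when sending function responses back
Propagates them correctly across function calls within a single turn (both parallel and sequential)

You do not need to manage thought signatures manually; Spring AI ensures they are attached to the functionCall parts as the API specification requires.

Function Calling Example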

@Bean
@Description("Get the weather in a location")
public Function<WeatherRequest, WeatherResponse> weatherFunction() {
    return new WeatherService();
}

// Enable includeThoughts for Gemini 3 Pro with function calling
String response = ChatClient.create(this.chatModel)
        .prompt("What's the weather like in Boston?")
        .options(GoogleGenAiChatOptions.builder()
            .model("gemini-3-pro-preview")
            .includeThoughts(true)
            .build())
        .toolNames("weatherFunction")
        .call()
        .content();

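Manual Tool Execution Mode

If you set internalToolExecutionEnabled=false to control the tool execution loop yourself, you must handle thought signatures yourself when using Gemini 3 Pro with includeThoughts=true.

Requirements for manual tool execution with thought signatures:

Extract the thought signatures from the response metadata: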

AssistantMessage assistantMessage = response.getResult().getOutput();
Map<String, Object> metadata = assistantMessage.getMetadata();
List<byte[]> thoughtSignatures = (List<byte[]>) metadata.get("thoughtSignatures");

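When sending function responses back, include the original AssistantMessage, with its metadata intact, in your message history. Spring AI automatically attaches the thought signatures to the correct functionCall parts.
For Gemini 3 Pro, failing to preserve thought signatures within the current turn results in an HTTP 400 error from the API.

:::important
Thought signatures are required only for function calls in the current turn. When you start a new conversation turn (after a round of function calls has completed), you do not need to keep the previous turn's signatures.
:::

:::warning
Enabling includeThoughts increases token usage because responses include the thought process. This affects API cost but gives better transparency into the model's reasoning.
:::

Multimodal

Multimodality refers to a model's ability to understand and process information from multiple (input) sources at the same time, including text, pdf, images, audio, and other data formats.

Image, Audio, Video

Google's Gemini AI models support this capability by understanding and integrating text, code, audio, images, and video. For more details, see the blog post Introducing Gemini.

Spring AI's Message interface supports multimodal AI models by introducing the Media type. This type holds the data and information about media attachments in messages, using Spring's org.springframework.util.MimeType and a java.lang.Object for the raw media data.

Below is a simple code example, taken from GoogleGenAiChatModelIT.java, that demonstrates combining user text with an image.

byte[] data = new ClassPathResource("/vertex-test.png").getContentAsByteArray();

// Send the text and the image together in a single user message.
var userMessage = UserMessage.builder()
        .text("Explain what you see in this picture.")
        .media(List.of(new Media(MimeTypeUtils.IMAGE_PNG, data)))
        .build();

ChatResponse response = chatModel.call(new Prompt(List.of(userMessage)));

PDF

Google GenAI supports PDF input. Use the application/pdf media type to attach a PDF file to the message: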

var pdfData = new ClassPathResource("/spring-ai-reference-overview.pdf");

var userMessage = UserMessage.builder()
			.text("You are a very professional document summarization specialist. Please summarize the given document.")
			.media(List.of(new Media(new MimeType("application", "pdf"), pdfData)))
			.build();

var response = this.chatModel.call(new Prompt(List.of(userMessage)));

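Cached Content

Google GenAI's context caching lets you cache large amounts of content (such as long documents, code repositories, or media files) and reuse it across multiple requests. This can significantly reduce API costs and improve response latency for repeated queries against the same content.

Benefits

Cost reduction: cached tokens are billed at a much lower rate than regular input tokens (typically 75-90% cheaper)
Improved performance: reusing cached content shortens processing time for large contexts
Consistency: the same cached context helps keep responses consistent across requests

Cache Requirements

Minimum cache size: 32,768 tokens (roughly 25,000 words)
Maximum cache duration: 1 hour by default (configurable via TTL)
Cached content must include either system instructions or conversation history

Using the Cached Content Service

Spring AI provides GoogleGenAiCachedContentService for programmatic cache management. The service is configured automatically when you use Spring Boot auto-configuration.

Creating Cached Content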

@Autowired
private GoogleGenAiCachedContentService cachedContentService;

// Create cached content with a large document
String largeDocument = "... your large context here (>32k tokens) ...";

CachedContentRequest request = CachedContentRequest.builder()
    .model("gemini-2.0-flash")
    .contents(List.of(
        Content.builder()
            .role("user")
            .parts(List.of(Part.fromText(largeDocument)))
            .build()
    ))
    .displayName("My Large Document Cache")
    .ttl(Duration.ofHours(1))
    .build();

GoogleGenAiCachedContent cachedContent = cachedContentService.create(request);
String cacheName = cachedContent.getName(); // Save this for reuse

Using Cached Content in Chat Requests

After creating cached content, reference it in your chat requests:

ChatResponse response = chatModel.call(
    new Prompt(
        "Summarize the key points from the document",
        GoogleGenAiChatOptions.builder()
            .useCachedContent(true)
            .cachedContentName(cacheName) // Use the cached content name
            .build()
    ));

Or through configuration properties:

spring.ai.google.genai.chat.options.use-cached-content=true
spring.ai.google.genai.chat.options.cached-content-name=cachedContent/your-cache-name

Managing Cached Content

GoogleGenAiCachedContentService provides comprehensive cache management:

// Retrieve cached content
GoogleGenAiCachedContent content = cachedContentService.get(cacheName);

// Update cache TTL
CachedContentUpdateRequest updateRequest = CachedContentUpdateRequest.builder()
    .ttl(Duration.ofHours(2))
    .build();
GoogleGenAiCachedContent updated = cachedContentService.update(cacheName, updateRequest);

// List all cached content
List<GoogleGenAiCachedContent> allCaches = cachedContentService.listAll();

// Delete cached content
boolean deleted = cachedContentService.delete(cacheName);

// Extend cache TTL
GoogleGenAiCachedContent extended = cachedContentService.extendTtl(cacheName, Duration.ofMinutes(30));

// Cleanup expired caches
int removedCount = cachedContentService.cleanupExpired();

Asynchronous Operations

All operations have asynchronous variants:

CompletableFuture<GoogleGenAiCachedContent> futureCache =
    cachedContentService.createAsync(request);

CompletableFuture<GoogleGenAiCachedContent> futureGet =
    cachedContentService.getAsync(cacheName);

CompletableFuture<Boolean> futureDelete =
    cachedContentService.deleteAsync(cacheName);
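
Because the asynchronous variants return CompletableFuture, they can be chained with the standard JDK combinators instead of blocking between calls. The sketch below demonstrates the chaining pattern only. It uses a hypothetical in-memory stand-in (`AsyncChainingSketch` with its own `createAsync` and `deleteAsync`) rather than the real GoogleGenAiCachedContentService, so that it runs without any Google credentials.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class AsyncChainingSketch {

    // Hypothetical in-memory stand-in for a cache service; not the Spring AI class.
    private static final Map<String, String> STORE = new ConcurrentHashMap<>();

    /** Stores the content asynchronously and completes with the cache name. */
    public static CompletableFuture<String> createAsync(String name, String content) {
        return CompletableFuture.supplyAsync(() -> {
            STORE.put(name, content);
            return name;
        });
    }

    /** Removes the entry asynchronously; completes with true if it existed. */
    public static CompletableFuture<Boolean> deleteAsync(String name) {
        return CompletableFuture.supplyAsync(() -> STORE.remove(name) != null);
    }

    /** Creates an entry, then deletes it once creation has completed. */
    public static CompletableFuture<Boolean> createThenDelete(String name, String content) {
        // thenCompose flattens the nested future returned by deleteAsync.
        return createAsync(name, content).thenCompose(AsyncChainingSketch::deleteAsync);
    }

    public static void main(String[] args) {
        boolean deleted = createThenDelete("cache-1", "large document").join();
        System.out.println("deleted = " + deleted);
    }
}
```

The same `thenCompose` pattern should apply to any pair of operations where the second needs the result of the first, such as creating a cache and then using its name.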

Auto-caching

Spring AI can cache prompts automatically when they exceed a specified token threshold:

# Automatically cache prompts larger than 100,000 tokens
spring.ai.google.genai.chat.options.auto-cache-threshold=100000
# Set auto-cache TTL to 1 hour
spring.ai.google.genai.chat.options.auto-cache-ttl=PT1H

Or programmatically:

ChatResponse response = chatModel.call(
    new Prompt(
        largePrompt,
        GoogleGenAiChatOptions.builder()
            .autoCacheThreshold(100000)
            .autoCacheTtl(Duration.ofHours(1))
            .build()
    ));

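Note

Auto-caching is suited to one-off large contexts. For contexts that are reused repeatedly, creating and referencing cached content manually is more efficient.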
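
The auto-cache-ttl property takes an ISO-8601 duration such as PT1H. For readers unfamiliar with that notation, the JDK's java.time.Duration parses and prints the same format, as the short sketch below shows. `TtlFormatSketch` and `parseTtl` are hypothetical names used only for this illustration, and the sketch does not claim to show how Spring binds the property.

```java
import java.time.Duration;

public class TtlFormatSketch {

    /** Parses an ISO-8601 duration string such as "PT1H" or "PT30M". */
    public static Duration parseTtl(String value) {
        return Duration.parse(value);
    }

    public static void main(String[] args) {
        System.out.println(parseTtl("PT1H").equals(Duration.ofHours(1))); // true
        System.out.println(parseTtl("PT30M").toMinutes());                // 30
        System.out.println(Duration.ofMinutes(90));                       // PT1H30M
    }
}
```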

Monitoring Cache Usage

Cached content includes usage metadata that can be accessed through the service:

GoogleGenAiCachedContent content = cachedContentService.get(cacheName);

// Check if cache is expired
boolean expired = content.isExpired();

// Get remaining TTL
Duration remaining = content.getRemainingTtl();

// Get usage metadata
CachedContentUsageMetadata metadata = content.getUsageMetadata();
if (metadata != null) {
    System.out.println("Total tokens: " + metadata.totalTokenCount().orElse(0));
}
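
To make the expiry checks above concrete, here is a minimal sketch of how an "is expired" flag and a remaining TTL could be derived from an expiry timestamp. It is one plausible reading, not the library's implementation, which this document does not show. `RemainingTtlSketch` and its methods are hypothetical, and a fixed Clock is used so the output is deterministic.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

public class RemainingTtlSketch {

    /** True when the expiry instant is not after the current instant. */
    public static boolean isExpired(Instant expireTime, Clock clock) {
        return !expireTime.isAfter(clock.instant());
    }

    /** Time left until expiry, never negative. */
    public static Duration remainingTtl(Instant expireTime, Clock clock) {
        Duration left = Duration.between(clock.instant(), expireTime);
        return left.isNegative() ? Duration.ZERO : left;
    }

    public static void main(String[] args) {
        Clock fixed = Clock.fixed(Instant.parse("2025-01-01T00:00:00Z"), ZoneOffset.UTC);
        Instant expiresAt = Instant.parse("2025-01-01T00:45:00Z");
        System.out.println(isExpired(expiresAt, fixed));    // false
        System.out.println(remainingTtl(expiresAt, fixed)); // PT45M
    }
}
```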

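Best Practices

Cache lifetime: set a TTL that fits your use case. Use a shorter TTL for content that changes often and a longer one for static content.
Cache naming: use descriptive display names so cached content is easy to identify.
Cleanup: periodically remove expired caches to stay organized.
Token threshold: cache only content that exceeds the minimum threshold (32,768 tokens).
Cost optimization: reuse cached content across multiple requests to maximize savings.

Configuration Example

Complete configuration example: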

# Enable cached content service (enabled by default)
spring.ai.google.genai.chat.enable-cached-content=true

# Use a specific cached content
spring.ai.google.genai.chat.options.use-cached-content=true
spring.ai.google.genai.chat.options.cached-content-name=cachedContent/my-cache-123

# Auto-caching configuration
spring.ai.google.genai.chat.options.auto-cache-threshold=50000
spring.ai.google.genai.chat.options.auto-cache-ttl=PT30M

Sample Controller

Create a new Spring Boot project and add spring-ai-starter-model-google-genai to your pom (or gradle) dependencies.

Add an application.properties file under the src/main/resources directory to enable and configure the Google GenAI chat model:

Using the Gemini Developer API (API Key)

spring.ai.google.genai.api-key=YOUR_API_KEY
spring.ai.google.genai.chat.options.model=gemini-2.0-flash
spring.ai.google.genai.chat.options.temperature=0.5

Using Vertex AI

spring.ai.google.genai.project-id=PROJECT_ID
spring.ai.google.genai.location=LOCATION
spring.ai.google.genai.chat.options.model=gemini-2.0-flash
spring.ai.google.genai.chat.options.temperature=0.5

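Tip

Replace project-id with your Google Cloud project ID; location is a Google Cloud region such as us-central1 or europe-west1.

Note

Each model has its own list of supported regions, which you can find on the model's page.

This creates a GoogleGenAiChatModel implementation that you can inject into your classes. Here is an example of a simple @RestController class that uses the chat model for text generation.

import java.util.Map;

import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.google.genai.GoogleGenAiChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import reactor.core.publisher.Flux;

@RestController
public class ChatController {

    private final GoogleGenAiChatModel chatModel;

    @Autowired
    public ChatController(GoogleGenAiChatModel chatModel) {
        this.chatModel = chatModel;
    }

    // Returns the complete generation in a single JSON object.
    @GetMapping("/ai/generate")
    public Map<String, String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }

    // Streams the response as it is generated.
    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return this.chatModel.stream(prompt);
    }
}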
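
To try the /ai/generate endpoint from Java, the JDK's built-in HTTP client is enough. The sketch below is a hedged illustration. `GenerateEndpointClient` is a hypothetical class, and the base URL http://localhost:8080 assumes the application runs locally on Spring Boot's default port. The `main` method only prints the request URI, since calling `generate` requires the application to be running.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class GenerateEndpointClient {

    /** Builds the request URI for /ai/generate with a URL-encoded message parameter. */
    public static URI generateUri(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/ai/generate?message=" + encoded);
    }

    /** Sends a GET request and returns the raw response body (needs a running server). */
    public static String generate(String baseUrl, String message) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(generateUri(baseUrl, message)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) {
        // Assumption: the sample controller is reachable at this base URL.
        System.out.println(generateUri("http://localhost:8080", "Tell me a joke"));
    }
}
```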

Manual Configuration

GoogleGenAiChatModel implements the ChatModel interface and uses com.google.genai.Client to connect to the Google GenAI service.

Add the spring-ai-google-genai dependency to your project's Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-google-genai</artifactId>
</dependency>

or to your Gradle build.gradle build file.

dependencies {
    implementation 'org.springframework.ai:spring-ai-google-genai'
}

Tip

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create a GoogleGenAiChatModel and use it for text generation:

Using an API Key

// Build a client for the Gemini Developer API from an API key.
Client genAiClient = Client.builder()
    .apiKey(System.getenv("GOOGLE_API_KEY"))
    .build();

var chatModel = new GoogleGenAiChatModel(genAiClient,
    GoogleGenAiChatOptions.builder()
        .model(ChatModel.GEMINI_2_0_FLASH)
        .temperature(0.4)
        .build());

ChatResponse response = chatModel.call(
    new Prompt("Generate the names of 5 famous pirates."));

Using Vertex AI

// Build a client for Vertex AI from the Google Cloud project and location.
Client genAiClient = Client.builder()
    .project(System.getenv("GOOGLE_CLOUD_PROJECT"))
    .location(System.getenv("GOOGLE_CLOUD_LOCATION"))
    .vertexAI(true)
    .build();

var chatModel = new GoogleGenAiChatModel(genAiClient,
    GoogleGenAiChatOptions.builder()
        .model(ChatModel.GEMINI_2_0_FLASH)
        .temperature(0.4)
        .build());

ChatResponse response = chatModel.call(
    new Prompt("Generate the names of 5 famous pirates."));

GoogleGenAiChatOptions provides the configuration for chat requests. GoogleGenAiChatOptions.Builder is a fluent options builder.

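Migrating from Vertex AI Gemini

If you currently use the Vertex AI Gemini implementation (spring-ai-vertex-ai-gemini), you can migrate to Google GenAI with minimal changes:

Key Differences

SDK: Google GenAI uses the new com.google.genai.Client instead of com.google.cloud.vertexai.VertexAI
Authentication: supports both API keys and Google Cloud credentials
Package names: classes live in org.springframework.ai.google.genai instead of org.springframework.ai.vertexai.gemini
Property prefix: uses spring.ai.google.genai instead of spring.ai.vertex.ai.gemini

When to Use Google GenAI vs. Vertex AI Gemini

Use Google GenAI when:
You want quick prototyping with API keys
You need the latest Gemini features from the Developer API
You want the flexibility to switch between API key mode and Vertex AI mode

Use Vertex AI Gemini when:
You have existing Vertex AI infrastructure
You need specific Vertex AI enterprise features
Your organization requires deployment on Google Cloud only

Low-level Java Client

The Google GenAI implementation is built on the new Google GenAI Java SDK, which provides a modern, streamlined API for accessing Gemini models.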
