Spring AI: From Getting Started to Mastery

I. Quick Start

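Official documentation: Introduction :: Spring AI Reference (a competing API: LangChain4j)

The Spring AI project aims to streamline the development of applications that incorporate AI functionality without unnecessary complexity. (It integrates deeply with the Spring ecosystem and uses Spring Boot auto-configuration and AOP to greatly reduce the code needed for LLM applications.)

The project draws inspiration from well-known Python projects such as LangChain and LlamaIndex, but Spring AI is not a direct port of them.

It was founded on the belief that the next wave of generative AI applications will not be only for Python developers but will be ubiquitous across many programming languages.

Spring AI addresses the fundamental challenge of AI integration: connecting enterprise data and APIs with AI models.

Spring AI provides abstractions that serve as the foundation for developing AI applications. These abstractions have multiple implementations, so components can be swapped with minimal code changes.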

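Key Features

Support for mainstream AI models
Structured output
Support for mainstream vector databases
Tool calling
A document ETL framework for cleaning and ingesting data
Spring Boot starters for the components
The high-level ChatClient API
The Advisors API for intercepting and enhancing requests
Conversation history memory and RAG support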

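Core Concepts

Model

AI models are algorithms designed to process and generate information, often mimicking human cognitive functions. By learning patterns and insights from large datasets, these models can make predictions and generate text, images, or other outputs, enhancing applications across many industries.

There are many kinds of AI models, each suited to specific use cases. Although ChatGPT and its generative AI capabilities have captivated users through text input and output, many models and companies offer a variety of input and output types.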

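Prompts

Prompts are the foundation of the language-based inputs that guide an AI model to produce specific outputs. ChatGPT's API accepts multiple text inputs within a single prompt, and each text input is assigned a role.

For example, the system role tells the model how to behave and sets the context for the interaction. The user role is typically the input from the user. Crafting effective prompts is both an art and a science.

The main roles in a prompt are:

System Role: Guides the AI's behavior and response style, setting the parameters or rules for how the AI interprets and replies to input. It is akin to giving the AI instructions before the conversation starts.
User Role: Represents the user's input: the questions, commands, or statements addressed to the AI. This role is fundamental because it forms the basis of the AI's response.
Assistant Role: The AI's response to the user's input. More than just an answer or reaction, it is crucial for maintaining the flow of the conversation. By tracking the AI's previous responses (its "Assistant Role" messages), the system keeps interactions coherent and contextually relevant. An assistant message may also contain a tool call request, which the AI uses when it needs a specific function performed, such as a calculation or a data lookup, rather than plain conversation.
Tool/Function Role: Returns additional information in response to a tool call requested in an assistant message.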
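To make the four roles concrete, the sketch below models one tool-assisted exchange as an ordered list of role-tagged messages. The `Role` enum, the `Message` record, and the message texts are hypothetical and defined only for this illustration; they are not Spring AI's own message classes.

```java
import java.util.List;

// Hypothetical types for illustration only; not Spring AI's message classes.
enum Role { SYSTEM, USER, ASSISTANT, TOOL }

record Message(Role role, String text) { }

final class RoleDemo {

    // One exchange in which the assistant requests a tool before answering.
    static List<Message> conversation() {
        return List.of(
                new Message(Role.SYSTEM, "You are a concise assistant."),
                new Message(Role.USER, "What day is tomorrow?"),
                new Message(Role.ASSISTANT, "tool-call: getCurrentDateTime()"),
                new Message(Role.TOOL, "2025-05-01T10:00+08:00"),
                new Message(Role.ASSISTANT, "Tomorrow is Friday, May 2, 2025."));
    }

    public static void main(String[] args) {
        for (Message m : conversation()) {
            System.out.println(m.role() + ": " + m.text());
        }
    }
}
```

The whole list, not just the latest user message, is what gets sent to the model on each turn, which is why the assistant and tool messages have to be kept.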

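Prompt Templates

Creating effective prompts involves establishing the context of the request and substituting parts of the request with values specific to the user's input. This process uses a traditional text-based template engine to create and manage prompts. Spring AI uses the open-source library StringTemplate for this purpose.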
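As a rough illustration of placeholder substitution, the sketch below fills `{name}`-style placeholders using plain Java. It is not StringTemplate and does not reproduce its syntax or features; the `render` helper and the placeholder format are assumptions made for this example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a stand-in for a real template engine such as StringTemplate.
final class SimplePromptTemplate {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces each {key} with its value and fails fast on a missing key.
    static String render(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in user input from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String prompt = render("Tell me a {adjective} joke about {topic}.",
                Map.of("adjective", "short", "topic", "cats"));
        System.out.println(prompt); // Tell me a short joke about cats.
    }
}
```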

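Embeddings

Embeddings are numerical representations of text, images, or videos that capture relationships between inputs. They work by converting text, images, and video into arrays of floating-point numbers called vectors. These vectors are designed to capture the meaning of the text, images, and video. The length of the embedding array is called the vector's dimensionality.

By computing the numerical distance between the vector representations of two pieces of text, an application can determine the similarity between the objects used to generate the embedding vectors.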
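The text does not say which distance measure is used. A minimal sketch, assuming cosine similarity (one common choice) and made-up three-dimensional vectors, might look like this:

```java
// Illustrative only: real embeddings have hundreds or thousands of dimensions.
final class CosineSimilarity {

    // Returns a value in [-1, 1]; higher means the vectors point in more similar directions.
    // Undefined (NaN) if either vector is all zeros.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] cat = {0.9f, 0.1f, 0.0f};     // made-up vectors, not real embeddings
        float[] kitten = {0.8f, 0.2f, 0.1f};
        float[] car = {0.0f, 0.2f, 0.9f};
        System.out.println(cosine(cat, kitten) > cosine(cat, car)); // true
    }
}
```

A cosine distance can be derived as `1 - cosine(a, b)`, so a smaller distance corresponds to a higher similarity.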

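Tokens

Tokens are the building blocks of how an AI model works. On input, the model converts words into tokens. On output, it converts tokens back into words.

In English, one token corresponds to roughly 75% of a word. For reference, Shakespeare's complete works total about 900,000 words, which translates to roughly 1.2 million tokens.

Models are also subject to token limits, which restrict the amount of text processed in a single API call. This threshold is commonly called the "context window". The model does not process any text beyond this limit.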
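The rule of thumb above can be turned into a quick estimate. The sketch below only approximates token counts from word counts; real counts depend on the model's tokenizer, and the 128,000-token window used in `main` is a hypothetical value chosen for the example.

```java
final class TokenEstimate {

    // Rule of thumb from the text: one token is about 0.75 English words.
    static final double WORDS_PER_TOKEN = 0.75;

    static long estimateTokens(long wordCount) {
        return Math.round(wordCount / WORDS_PER_TOKEN);
    }

    // True if the estimated token count fits in the given context window.
    static boolean fitsInContext(long wordCount, long contextWindowTokens) {
        return estimateTokens(wordCount) <= contextWindowTokens;
    }

    public static void main(String[] args) {
        System.out.println(estimateTokens(900_000));          // 1200000
        System.out.println(fitsInContext(900_000, 128_000));  // false (hypothetical window size)
    }
}
```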

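Bringing Your Data & APIs to the AI Model

Large models are trained on historical data and cannot access real-time information. Three techniques exist for customizing an AI model to incorporate your data:

Fine-tuning: This traditional machine learning technique involves tailoring the model and changing its internal weights.
Prompt stuffing: A more practical alternative is to embed your data in the prompt given to the model.
Tool calling: This technique lets you register custom user functions that connect the large language model to the APIs of external systems.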

II. Getting-Started Example

Spring AI is supported from Spring Boot 3.x onward. A reference pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.4.5</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.bigo</groupId>
    <artifactId>hello-ai-example</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>hello-ai-example</name>
    <properties>
        <java.version>17</java.version>
        <spring-ai.version>1.0.0-M7</spring-ai.version>
    </properties>
    <dependencies>
        <!-- Standard web -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- AI -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-model-openai</artifactId>
        </dependency>
    </dependencies>
    <dependencyManagement>
        <dependencies>
            <!-- Dependency management -->
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>

Configure the model provider and API key

spring:
  ai:
    openai:
      api-key: xxxx # API key
      base-url: https://api.openai.com # provider endpoint
      chat:
        options:
          model: gpt-4o-mini # model name

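Calling the model directly

// Requires a ChatClient bean; the starter only auto-configures ChatClient.Builder
// (see "Building a ChatClient" in section III).
@Autowired
private ChatClient chatClient;

@GetMapping("/ai")
String generation(String userInput) {
    // The question is hard-coded for this demo; userInput is not used here.
    return this.chatClient.prompt()
            .user("What day is tomorrow?")
            .call()
            .content();
}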

Summary: usage comes down to three steps

1. Add the AI starter dependency

2. Configure the model parameters

3. Call the API through the bean and read the result.

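III. Key Components

Building a ChatClient

ChatClient offers a fluent API for communicating with an AI model. It supports both a synchronous and a streaming programming model.

AI models mainly process two types of messages: user messages (direct input from the user) and system messages (generated by the system to guide the conversation).

These messages often contain placeholders that are substituted at runtime based on user input, customizing the model's response to that input.

Reference: Chat Client API :: Spring AI Reference

You can also specify Prompt options, such as the name of the AI model to use and the temperature setting that controls the randomness or creativity of the generated output.

The Spring Boot starter automatically registers a ChatClient.Builder bean in the container; use this bean to create the ChatClient object.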

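import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
class MyController {

    private final ChatClient chatClient;

    // Build the ChatClient from the auto-configured builder
    public MyController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    @GetMapping("/ai")
    String generation(String userInput) {
        return this.chatClient.prompt()
                .user(userInput)
                .call()
                .content();
    }
}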

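Specifying defaults

Creating a ChatClient with a default system text in a @Configuration class simplifies runtime code. With defaults in place, you only specify the user text when calling ChatClient, with no need to set the system text for every request in the runtime code path.

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.memory.InMemoryChatMemory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AiConfig {

    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder
                // Makes debug logs easier to inspect
                .defaultAdvisors(new SimpleLoggerAdvisor())
                // Stores conversation history; other backends (in-memory, databases, ...) are also supported
                .defaultAdvisors(new MessageChatMemoryAdvisor(new InMemoryChatMemory()))
                // System text that guides how the AI responds
                .defaultSystem("You are an Apollo support assistant who answers questions about the Apollo configuration center.")
                .build();
    }
}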

Other commonly used options

defaultOptions
defaultTools
defaultAdvisors

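Model API

A portable model API across AI providers for chat, text-to-image, audio transcription, text-to-speech, and embedding models. Both synchronous and streaming API options are supported.

The ChatModel API makes it easy for application developers to exchange text with an AI model. It abstracts the interaction between the application and the model, using Prompt as input and ChatResponse as output.

ChatModel takes a Prompt or part of a conversation as input and sends it to the backing large model. The model generates a conversational response based on its training data and its understanding of natural language, and the application can present the response to the user or use it for further processing.

Chat Models

The Spring AI Chat Model API is designed to be a simple, easy-to-use interface for interacting with a variety of AI models, letting developers switch between models with minimal code changes. This design matches Spring's philosophy of modularity and interchangeability. The following uses OpenAI as the example provider.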

Property	Description	Default
spring.ai.openai.base-url	The URL to connect to	api.openai.com
spring.ai.openai.api-key	The API Key	-
spring.ai.openai.organization-id	Optionally, you can specify which organization to use for an API request.	-
spring.ai.openai.project-id	Optionally, you can specify which project to use for an API request.	-

The prefix spring.ai.openai.chat is the property prefix that lets you configure the chat model implementation for OpenAI.

Property	Description	Default
spring.ai.openai.chat.enabled (Removed and no longer valid)	Enable OpenAI chat model.	true
spring.ai.model.chat	Enable OpenAI chat model.	openai
spring.ai.openai.chat.base-url	Optional override for the `spring.ai.openai.base-url` property to provide a chat-specific URL.	-
spring.ai.openai.chat.completions-path	The path to append to the base URL.	`/v1/chat/completions`
spring.ai.openai.chat.api-key	Optional override for the `spring.ai.openai.api-key` to provide a chat-specific API Key.	-
spring.ai.openai.chat.options.model	Name of the OpenAI chat model to use. You can select between models such as: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, `gpt-3.5-turbo`, and more. See the models page for more information.	`gpt-4o-mini`
spring.ai.openai.chat.options.temperature	The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify `temperature` and `top_p` for the same completions request as the interaction of these two settings is difficult to predict.	0.8

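You can also call the Model API directly

import java.util.Map;

import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class ChatController {

    private final OpenAiChatModel chatModel;

    @Autowired
    public ChatController(OpenAiChatModel chatModel) {
        this.chatModel = chatModel;
    }

    // Synchronous call: returns the complete answer as a string
    @GetMapping("/ai/generate")
    public Map<String, String> generate(
            @RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }

    // Streaming call: emits partial responses as they arrive
    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(
            @RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return this.chatModel.stream(prompt);
    }
}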

You can also create the Model object manually

OpenAiApi openAiApi = OpenAiApi.builder()
        .apiKey("xxxx")
        .build();
var openAiChatOptions = OpenAiChatOptions.builder()
        .model("gpt-4o-mini")
        .temperature(0.4)
        .maxTokens(200)
        .build();
var chatModel = OpenAiChatModel.builder()
        .openAiApi(openAiApi)
        .defaultOptions(openAiChatOptions)
        .build();

ChatResponse response = chatModel.call(new Prompt("Generate the names of 50 famous pirates."));
String text = response.getResult().getOutput().getText();
System.out.println(text);

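Embeddings Model API

Embeddings are numerical representations of text, images, or videos that capture relationships between inputs.

Embeddings work by converting text, images, and video into arrays of floating-point numbers called vectors. These vectors are designed to capture the meaning of the text, images, and video. The length of the embedding array is called the vector's dimensionality.

By computing the numerical distance between the vector representations of two pieces of text, an application can determine the similarity between the objects used to generate the embedding vectors. (The smaller the distance, the more closely related they are; the larger the distance, the less related.)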

Property	Description	Default
spring.ai.openai.embedding.enabled (Removed and no longer valid)	Enable OpenAI embedding model.	true
spring.ai.model.embedding	Enable OpenAI embedding model.	openai
spring.ai.openai.embedding.base-url	Optional overrides the spring.ai.openai.base-url to provide embedding specific url	-
spring.ai.openai.embedding.embeddings-path	The path to append to the base-url	`/v1/embeddings`
spring.ai.openai.embedding.api-key	Optional overrides the spring.ai.openai.api-key to provide embedding specific api-key	-
spring.ai.openai.embedding.options.model	The model to use	text-embedding-ada-002 (other options: text-embedding-3-large, text-embedding-3-small)

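import java.util.List;
import java.util.Map;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class EmbeddingController {

    private final EmbeddingModel embeddingModel;

    @Autowired
    public EmbeddingController(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    // Embeds the message and returns the raw embedding response
    @GetMapping("/ai/embedding")
    public Map<String, EmbeddingResponse> embed(
            @RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        EmbeddingResponse embeddingResponse = this.embeddingModel.embedForResponse(List.of(message));
        return Map.of("embedding", embeddingResponse);
    }
}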

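spring:
  ai:
    openai:
      api-key: xxxxx
      base-url: https://api.openai.com
      embedding:
        options:
          model: text-embedding-3-small # an OpenAI model from the table above; use your provider's model name for other endpoints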

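Vector Databases

A vector database is a specialized type of database that plays an essential role in AI applications.

Queries in a vector database differ from those in a traditional relational database. They perform similarity searches rather than exact matches. Given a vector as the query, a vector database returns vectors that are "similar" to the query vector.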
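To illustrate what "similar rather than exact" means, here is a minimal brute-force sketch that returns the k nearest stored vectors by Euclidean distance. `TinyVectorStore` and its methods are hypothetical and unrelated to Spring AI's VectorStore; production vector databases typically rely on approximate indexes rather than scanning every entry.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical brute-force store, for illustration only.
final class TinyVectorStore {

    record Entry(String id, double[] vector) { }

    private final List<Entry> entries = new ArrayList<>();

    void add(String id, double[] vector) {
        entries.add(new Entry(id, vector));
    }

    // Returns the ids of the k stored vectors closest to the query, nearest first.
    List<String> similaritySearch(double[] query, int k) {
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> distance(e.vector(), query)))
                .limit(k)
                .map(Entry::id)
                .toList();
    }

    // Euclidean distance; assumes both vectors have the same dimensionality.
    private static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        TinyVectorStore store = new TinyVectorStore();
        store.add("doc-a", new double[] {1.0, 0.0});
        store.add("doc-b", new double[] {0.9, 0.1});
        store.add("doc-c", new double[] {0.0, 1.0});
        // No stored vector equals the query exactly, yet the two nearest are returned.
        System.out.println(store.similaritySearch(new double[] {1.0, 0.05}, 2)); // [doc-a, doc-b]
    }
}
```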

Using Milvus as an example: Milvus :: Spring AI Reference

public interface VectorStore extends DocumentWriter {

    default String getName() {
        return this.getClass().getSimpleName();
    }

    void add(List<Document> documents);

    default void accept(List<Document> documents) {
        this.add(documents);
    }

    void delete(List<String> idList);

    void delete(Filter.Expression filterExpression);

    default void delete(String filterExpression) {
        SearchRequest searchRequest = SearchRequest.builder().filterExpression(filterExpression).build();
        Filter.Expression textExpression = searchRequest.getFilterExpression();
        Assert.notNull(textExpression, "Filter expression must not be null");
        this.delete(textExpression);
    }

    @Nullable
    List<Document> similaritySearch(SearchRequest request);

    @Nullable
    default List<Document> similaritySearch(String query) {
        return this.similaritySearch(SearchRequest.builder().query(query).build());
    }
}

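Tool Calling

Tool calling is a common pattern in AI applications that lets a model interact with a set of APIs or tools, extending its capabilities.

Although tool calling is usually described as a model capability, the tool-calling logic is actually provided by the client application.

The model can only request a tool call and supply the input arguments; the application is responsible for executing the tool call with those arguments and returning the result. The model never has access to any of the APIs provided as tools, which is a critical security consideration.

Spring AI provides convenient APIs to define tools, resolve tool call requests from the model, and execute tool calls.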

1. Define the tool function

public class DateTimeTools {

    @Tool(description = "Get the current date and time in the user's timezone")
    String getCurrentDateTime() {
        return LocalDateTime.now()
                .atZone(LocaleContextHolder.getTimeZone().toZoneId())
                .toString();
    }
}

2. Register the tool on the ChatClient request

@GetMapping("/ai")
String generation(String userInput) {
    return this.chatClient.prompt()
            .user("What day is tomorrow?")
            .tools(new DateTimeTools()) // tools made available for this request
            .call()
            .content();
}

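Note: the model decides on its own whether to call a tool, based on how relevant the tool is to the user's question. To improve tool-calling accuracy, write the tool name and description carefully and in detail.

public @interface Tool {

    String name() default "";

    String description() default "";

    boolean returnDirect() default false;

    Class<? extends ToolCallResultConverter> resultConverter() default DefaultToolCallResultConverter.class;
}

The tool-calling interaction flow is: the application sends the tool definitions along with the request; the model replies with the name of the tool to call and its input arguments; the application executes the tool and sends the result back; the model then uses that result to produce its final answer.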
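The request, execute, and return loop described in this section can be sketched in plain Java. Everything here is hypothetical: `fakeModel` stands in for a real LLM, and `ModelReply` and the tool registry are invented for the example. In Spring AI this loop is handled by the framework.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the tool-calling loop; not Spring AI code.
final class ToolCallingLoop {

    // What the model returns: either a final answer or a request to run a tool.
    record ModelReply(String toolName, String toolInput, String answer) {
        boolean isToolRequest() { return toolName != null; }
    }

    // Stand-in model: asks for the tool first, then answers using the tool result.
    static ModelReply fakeModel(String userText, String toolResult) {
        if (toolResult == null) {
            return new ModelReply("getCurrentDate", "", null);
        }
        return new ModelReply(null, null, "Today is " + toolResult + ".");
    }

    // The application, not the model, owns and executes the tools.
    static String chat(String userText, Map<String, Function<String, String>> tools) {
        ModelReply reply = fakeModel(userText, null);
        while (reply.isToolRequest()) {
            Function<String, String> tool = tools.get(reply.toolName());
            if (tool == null) {
                throw new IllegalStateException("Unknown tool: " + reply.toolName());
            }
            String result = tool.apply(reply.toolInput()); // executed locally by the application
            reply = fakeModel(userText, result);            // the result goes back to the model
        }
        return reply.answer();
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> tools =
                Map.of("getCurrentDate", input -> "2025-05-01"); // fixed value for the demo
        System.out.println(chat("What is today's date?", tools)); // Today is 2025-05-01.
    }
}
```

Because the model only ever sees tool names, descriptions, and results, the application keeps full control over what actually runs.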

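Model Context Protocol (MCP)

The Model Context Protocol (MCP) is a standardized protocol that enables AI models to interact with external tools and resources in a structured way. It supports multiple transport mechanisms, providing flexibility across different environments.

The MCP Java SDK provides a Java implementation of the Model Context Protocol, enabling standardized interaction with AI models and tools through both synchronous and asynchronous communication patterns.

Spring AI MCP extends the MCP Java SDK with Spring Boot integration, providing both client and server starters.

MCP server marketplace: MCP Servers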

Client Starters

spring-ai-starter-mcp-client - Core starter providing STDIO and HTTP-based SSE support
spring-ai-starter-mcp-client-webflux - WebFlux-based SSE transport implementation

Server Starters

spring-ai-starter-mcp-server - Core server with STDIO transport support
spring-ai-starter-mcp-server-webmvc - Spring MVC-based SSE transport implementation
spring-ai-starter-mcp-server-webflux - WebFlux-based SSE transport implementation

The MCP Client Boot Starter provides:

Management of multiple client instances
Automatic client initialization (if enabled)
Support for multiple named transports
Integration with Spring AI’s tool execution framework
Proper lifecycle management with automatic cleanup of resources when the application context is closed
Customizable client creation through customizers

MCP Client usage steps

1. Add the starter dependency

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-client</artifactId>
</dependency>

2. Configure the MCP server connection parameters

spring:
  ai:
    openai:
      api-key: xxxx
      base-url: https://api.openai.com
      chat:
        options:
          model: gpt-4o-mini
      embedding:
        options:
          model: text-embedding-3-small # use your provider's embedding model name
    mcp:
      client:
        enabled: true
        name: my-mcp-client
        version: 1.0.0
        request-timeout: 30s
        toolcallback:
          enabled: true # must be enabled explicitly
        type: SYNC  # or ASYNC for reactive applications
        sse:
          connections:
            server1:
              url: http://localhost:8080

3. Pass the registered tool callbacks to the model call

@RestController
public class McpClientController {

    @Autowired
    private ToolCallbackProvider toolCallbackProvider;

    @Autowired
    private ChatClient chatClient;

    @GetMapping("/mcp-test")
    String test() {
        return this.chatClient.prompt()
                .user("What is the weather in Beijing?")
                .tools(toolCallbackProvider.getToolCallbacks()) // tools discovered from the MCP server
                .call()
                .content();
    }
}

MCP Server usage steps

1. Add the starter dependency

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
</dependency>

2. Configure the server parameters

spring:
  ai:
    mcp:
      server:
        name: webmvc-mcp-server
        version: 1.0.0
        type: SYNC

3. Expose the tools

@Bean
public ToolCallbackProvider weatherTools(WeatherService weatherService) {
    return MethodToolCallbackProvider.builder()
            .toolObjects(weatherService)
            .build();
}

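Note: under the hood this is still tool calling. MCP simply introduces a communication protocol that makes it possible to discover remote service interfaces.

The Spring AI MCP Server Starter offers two ways to implement an MCP server: stdio-based and SSE-based. The stdio-based implementation suits embedded scenarios, while the SSE-based implementation suits standalone service deployment.

With the @Tool and @ToolParam annotations, an ordinary Java method can easily be turned into an MCP tool that MCP clients can discover and call. Spring Boot's auto-configuration makes MCP server development simple and efficient.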

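IV. Spring AI Alibaba

The open-source Spring AI Alibaba project is built on Spring AI. It is the best-practice implementation of Alibaba Cloud's Tongyi model family and services for Java AI application development, offering high-level AI API abstractions and integration with cloud-native infrastructure to help developers build AI applications quickly.

Documentation: Spring AI Alibaba overview (official Spring AI Alibaba website on Alibaba Cloud)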

A simple example

1. Add spring-ai-alibaba-starter:

<dependency>
    <groupId>com.alibaba.cloud.ai</groupId>
    <artifactId>spring-ai-alibaba-starter</artifactId>
    <version>1.0.0-M6.1</version>
</dependency>

2. Configure application.yml:

spring:
  ai:
    dashscope:
      api-key: sk-xxxx

3. Initialize the client and call the API

private final ChatClient dashScopeChatClient;

// ChatClient can also be injected this way
public HelloworldController(ChatClient.Builder chatClientBuilder) {
    this.dashScopeChatClient = chatClientBuilder
            .defaultSystem(DEFAULT_PROMPT)
            // Advisor that implements Chat Memory.
            // When using Chat Memory, a conversation ID must be specified so Spring AI can handle the context.
            .defaultAdvisors(new MessageChatMemoryAdvisor(new InMemoryChatMemory()))
            // Advisor that implements logging
            .defaultAdvisors(new SimpleLoggerAdvisor())
            // Options for the ChatModel behind the ChatClient
            .defaultOptions(DashScopeChatOptions.builder().withTopP(0.7).build())
            .build();
}

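The DashScope Platform

DashScope (Lingji) offers flexible, easy-to-use model API services that make models of every modality readily available to AI developers. Through the DashScope API, developers can integrate the capabilities of large models directly and can also train and fine-tune models for customization.

Using the cloud platform gives you a wider choice of models.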

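V. Best Practices

1. Supporting conversation memory

"Conversation memory" for large language models refers to the ability to track, understand, and use earlier dialogue context during an interactive conversation. With this mechanism, the model responds not only to the immediate input but also in light of what was said before, so it can keep track of the user's intent and context and produce more natural, coherent dialogue.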

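// CHAT_MEMORY_CONVERSATION_ID_KEY and CHAT_MEMORY_RETRIEVE_SIZE_KEY are assumed to be
// static imports from AbstractChatMemoryAdvisor.
ChatClient chatClient = chatClientBuilder
        .defaultSystem(DEFAULT_PROMPT)
        // Advisor that implements Chat Memory.
        // When using Chat Memory, a conversation ID must be specified so Spring AI can handle the context.
        .defaultAdvisors(new MessageChatMemoryAdvisor(new InMemoryChatMemory()))
        // Advisor that implements logging
        .defaultAdvisors(new SimpleLoggerAdvisor())
        // Options for the ChatModel behind the ChatClient
        .defaultOptions(DashScopeChatOptions.builder().withTopP(0.7).build())
        .build();

ChatResponse response = chatClient.prompt()
        .user("I want to travel to Xinjiang")
        .advisors(spec -> spec
                .param(CHAT_MEMORY_CONVERSATION_ID_KEY, conversationId) // conversation ID
                .param(CHAT_MEMORY_RETRIEVE_SIZE_KEY, 10))              // number of past messages to retrieve
        .call()
        .chatResponse();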

Developers can also implement ChatMemory themselves to store and record conversation context in, for example, files or Redis.
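To show the idea behind a custom memory, here is a minimal sketch of a per-conversation store that keeps only the most recent messages. `WindowedChatMemory` is hypothetical: it does not implement Spring AI's ChatMemory interface and stores plain strings in memory, whereas a real implementation would adapt the same logic to that interface and to a backend such as a file or Redis.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; it does not implement Spring AI's ChatMemory interface.
final class WindowedChatMemory {

    private final int maxMessages;
    private final Map<String, Deque<String>> store = new ConcurrentHashMap<>();

    WindowedChatMemory(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    // Appends a message and evicts the oldest ones beyond the window.
    void add(String conversationId, String message) {
        Deque<String> history = store.computeIfAbsent(conversationId, id -> new ArrayDeque<>());
        synchronized (history) {
            history.addLast(message);
            while (history.size() > maxMessages) {
                history.removeFirst();
            }
        }
    }

    // Returns a copy of the retained messages, oldest first.
    List<String> get(String conversationId) {
        Deque<String> history = store.get(conversationId);
        if (history == null) {
            return List.of();
        }
        synchronized (history) {
            return new ArrayList<>(history);
        }
    }

    public static void main(String[] args) {
        WindowedChatMemory memory = new WindowedChatMemory(2);
        memory.add("user-42", "USER: I want to travel to Xinjiang");
        memory.add("user-42", "ASSISTANT: When do you plan to go?");
        memory.add("user-42", "USER: In July");
        // Only the two most recent messages are retained.
        System.out.println(memory.get("user-42"));
    }
}
```

Keying the store by conversation ID mirrors the conversation ID parameter passed to the advisor, and the window size plays the role of the retrieve size.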