Introduction: Can One Mountain Hold Two Tigers?
As the old saying goes, "one mountain cannot hold two tigers." But in the world of Spring AI, we are going to make OpenAI and Ollama, these two "LLM tigers," coexist peacefully!
This article walks through the source code to demystify Spring AI's auto-configuration and its call chain down to the underlying model: how multiple LLM client instances can coexist in a single application, and how the core components behind it (ChatClient.Builder, ChatModel, Advisor, and friends) work together.
1. Starting from ChatClient.Builder: Where Does It Come From?
Recall the earlier code sample: a ChatClient.Builder is injected through the class constructor, and the builder creates the ChatClient. So where does ChatClient.Builder come from?
```java
@Slf4j
@RestController
public class HelloController {

    private ChatClient chatClient;

    public HelloController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/hello")
    public String hello(@RequestParam(value = "input", defaultValue = "讲一个笑话") String input) {
        Prompt prompt = new Prompt(input);
        return chatClient.prompt(prompt).call().content();
    }
}
```
Pulled in by auto-configuration
Spring Boot declares auto-configuration classes by listing their fully qualified names in `AutoConfiguration.imports`. A Spring AI project pulls in the `spring-ai-spring-boot-autoconfigure` module, which provides Spring AI's auto-configuration:
```xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-spring-boot-autoconfigure</artifactId>
    <version>${project.version}</version>
</dependency>
```
ChatClientAutoConfiguration
Look at the `ChatClientAutoConfiguration` class: it takes effect as long as the `ChatClient` class is present on the classpath.
The class declares a `ChatClient.Builder` bean whose construction depends on four objects: `ChatClientBuilderConfigurer`, `ChatModel`, `ObservationRegistry`, and `ObservationConvention`:
- `ChatClientBuilderConfigurer` is defined beforehand in the same configuration class.
- `ObservationRegistry` and `ObservationConvention` are used for metrics collection and belong to Micrometer. Micrometer is a Java library for collecting and publishing application metrics (counters, timers, gauges, distribution summaries, and so on); its core purpose is to give Java applications a unified metrics facade that can be wired to many monitoring systems (such as Prometheus, Graphite, and InfluxDB). When reading Spring AI's code I recommend simply skipping this logic at first (I will also skip over it by default in the rest of this article).
- The core dependency is `ChatModel`: as long as a `ChatModel` bean exists, a `ChatClient.Builder` instance is created. That answers where `ChatClient.Builder` comes from; `ChatModel` itself is covered later. (Note the `@ConditionalOnMissingBean` annotation on the method: if you define a bean of the same type yourself, this code is skipped.)
```java
@Bean
@ConditionalOnMissingBean
ChatClientBuilderConfigurer chatClientBuilderConfigurer(ObjectProvider<ChatClientCustomizer> customizerProvider) {
    ChatClientBuilderConfigurer configurer = new ChatClientBuilderConfigurer();
    configurer.setChatClientCustomizers(customizerProvider.orderedStream().toList());
    return configurer;
}

@Bean
@Scope("prototype")
@ConditionalOnMissingBean
ChatClient.Builder chatClientBuilder(ChatClientBuilderConfigurer chatClientBuilderConfigurer, ChatModel chatModel,
        ObjectProvider<ObservationRegistry> observationRegistry,
        ObjectProvider<ChatClientObservationConvention> observationConvention) {
    ChatClient.Builder builder = ChatClient.builder(chatModel,
            observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP),
            observationConvention.getIfUnique(() -> null));
    return chatClientBuilderConfigurer.configure(builder);
}
```
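Notice that `chatClientBuilderConfigurer` collects every `ChatClientCustomizer` bean in the context and applies them to each builder. As a sketch (the bean name and system text below are invented for illustration), registering a customizer lets you shape every auto-configured builder without replacing it:

```java
import org.springframework.ai.chat.client.ChatClientCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class ChatClientDefaults {

    // Applied to every ChatClient.Builder produced by the auto-configuration above.
    @Bean
    ChatClientCustomizer conciseSystemPromptCustomizer() {
        return builder -> builder.defaultSystem("You are a concise assistant.");
    }
}
```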
Inside the `chatClientBuilder` method, the call to `ChatClient.builder` ultimately constructs a `DefaultChatClientBuilder` instance. In other words, `ChatClient.builder` is essentially a `DefaultChatClientBuilder`:
```java
static Builder builder(ChatModel chatModel, ObservationRegistry observationRegistry,
        @Nullable ChatClientObservationConvention customObservationConvention) {
    Assert.notNull(chatModel, "chatModel cannot be null");
    Assert.notNull(observationRegistry, "observationRegistry cannot be null");
    return new DefaultChatClientBuilder(chatModel, observationRegistry, customObservationConvention);
}
```
Recap
`ChatClient.Builder` is declared by the auto-configuration machinery; it is essentially a `DefaultChatClientBuilder`, and its instantiation hinges on an injected `ChatModel`.
2. Building the ChatModel
OpenAiAutoConfiguration
Since the project calls OpenAI-style model APIs, we need to find `OpenAiAutoConfiguration.java` under the `org.springframework.ai.autoconfigure` package, defined as follows:
```java
@AutoConfiguration(after = { RestClientAutoConfiguration.class, WebClientAutoConfiguration.class,
        SpringAiRetryAutoConfiguration.class, ToolCallingAutoConfiguration.class })
@ConditionalOnClass(OpenAiApi.class)
@EnableConfigurationProperties({ OpenAiConnectionProperties.class, OpenAiChatProperties.class,
        OpenAiEmbeddingProperties.class, OpenAiImageProperties.class, OpenAiAudioTranscriptionProperties.class,
        OpenAiAudioSpeechProperties.class, OpenAiModerationProperties.class })
@ImportAutoConfiguration(classes = { SpringAiRetryAutoConfiguration.class, RestClientAutoConfiguration.class,
        WebClientAutoConfiguration.class, ToolCallingAutoConfiguration.class })
public class OpenAiAutoConfiguration {
    // omitted...
}
```
- The `@ConditionalOnClass(OpenAiApi.class)` annotation means `OpenAiAutoConfiguration` is auto-configured as soon as a class named `org.springframework.ai.openai.api.OpenAiApi` is present.
- `OpenAiApi` is defined in `spring-ai-openai`, so once `spring-ai-openai-spring-boot-starter` is added to `pom.xml` the class is on the classpath and the whole auto-configuration takes effect:
```xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
```
`OpenAiAutoConfiguration` defines the bean for `OpenAiChatModel`:
- It depends on `OpenAiApi`, `RetryTemplate`, `RestClient`, `WebClient`, and a few other instances (not expanded here) to build the `OpenAiChatModel` instance.
- `RestClient` and `WebClient` exist mainly to issue the network requests.
- `RetryTemplate` defines the retry-policy template for outgoing requests; its bean is declared automatically in `SpringAiRetryAutoConfiguration`, so the various retry policies can be tuned through the properties defined in `SpringAiRetryProperties`.
- `OpenAiApi` wraps the various calls to the official OpenAI endpoints.
```java
@Bean
@ConditionalOnMissingBean
@ConditionalOnProperty(prefix = OpenAiChatProperties.CONFIG_PREFIX, name = "enabled", havingValue = "true",
        matchIfMissing = true)
public OpenAiChatModel openAiChatModel(OpenAiConnectionProperties commonProperties,
        OpenAiChatProperties chatProperties, ObjectProvider<RestClient.Builder> restClientBuilderProvider,
        ObjectProvider<WebClient.Builder> webClientBuilderProvider, ToolCallingManager toolCallingManager,
        RetryTemplate retryTemplate, ResponseErrorHandler responseErrorHandler,
        ObjectProvider<ObservationRegistry> observationRegistry,
        ObjectProvider<ChatModelObservationConvention> observationConvention) {

    var openAiApi = openAiApi(chatProperties, commonProperties,
            restClientBuilderProvider.getIfAvailable(RestClient::builder),
            webClientBuilderProvider.getIfAvailable(WebClient::builder), responseErrorHandler, "chat");

    var chatModel = OpenAiChatModel.builder()
        .openAiApi(openAiApi)
        .defaultOptions(chatProperties.getOptions())
        .toolCallingManager(toolCallingManager)
        .retryTemplate(retryTemplate)
        .observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
        .build();

    observationConvention.ifAvailable(chatModel::setObservationConvention);

    return chatModel;
}
```
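Note the `@ConditionalOnProperty` guard above: the chat model bean is created only when the `enabled` flag under `OpenAiChatProperties.CONFIG_PREFIX` is `true`, and `matchIfMissing = true` means it defaults to on. Assuming that prefix resolves to `spring.ai.openai.chat`, the auto-configured chat model can be switched off with a single property:

```properties
# Disable only the OpenAI chat model bean; embedding, image, etc. have their own flags.
spring.ai.openai.chat.enabled=false
```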
`OpenAiChatModel` implements the `ChatModel` interface, so the declared `OpenAiChatModel` bean is exactly the `ChatModel` we needed earlier~
```java
public class OpenAiChatModel extends AbstractToolCallSupport implements ChatModel {
```
Recap
`OpenAiAutoConfiguration` auto-configures `OpenAiChatModel`, and `OpenAiChatModel` is a `ChatModel`.
3. One Mountain, Two Tigers
Answering the opening question: to have multiple LLM client instances coexist in one project, just build multiple `ChatClient`s, each depending on a different model's `ChatModel`.
Trying it out
Add both the `openai` and `ollama` dependencies:
```xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
```
Starting the application now fails with the error below: two beans of type `ChatModel` were found, `ollamaChatModel` and `openAiChatModel`:
```text
14:02:44.565 [main] DEBUG o.s.b.d.LoggingFailureAnalysisReporter -- Application failed to start due to an exception
org.springframework.beans.factory.NoUniqueBeanDefinitionException: No qualifying bean of type
'org.springframework.ai.chat.model.ChatModel' available: expected single matching bean but found 2:
ollamaChatModel,openAiChatModel
    at org.springframework.beans.factory.config.DependencyDescriptor.resolveNotUnique(DependencyDescriptor.java:218)
```
From the source code above we know how `ChatModel` beans get built: whenever no bean of that type exists, one is generated automatically (Ollama's auto-configuration follows the same logic). `ChatClient.Builder` needs a unique `ChatModel`, but now there are two, so it naturally blows up (the two tigers are fighting~):
```java
private ChatClient chatClient;

public HelloController(ChatClient.Builder builder) {
    this.chatClient = builder.build();
}
```
The fix is simple. As shown below, manually build multiple `ChatClient`s, each backed by a different model:
```java
@Configuration
public class BeanConfig {

    @Bean("deepSeekChatClient")
    public ChatClient deepSeekChatClient(OpenAiChatModel openAiChatModel) {
        return ChatClient.create(openAiChatModel);
    }

    @Bean("ollamaChatClient")
    public ChatClient ollamaChatClient(OllamaChatModel ollamaChatModel) {
        return ChatClient.create(ollamaChatModel);
    }
}
```
Configure both models' call parameters in `application.properties`:
```properties
# ollama run llama3.1:8b
spring.ai.ollama.chat.model=llama3.1:8b
spring.ai.ollama.base-url=http://localhost:11434

spring.ai.openai.api-key=sk-xxx
spring.ai.openai.base-url=https://api.deepseek.com
spring.ai.openai.chat.options.model=deepseek-chat
```
Inject the `ChatClient`s by name, start the application, and call the endpoints: everything works.
```java
public class HelloController {

    @Resource
    private ChatClient deepSeekChatClient;

    @Resource
    private ChatClient ollamaChatClient;

    @GetMapping("/hello/deepseek")
    public String deepseek(@RequestParam(value = "input", defaultValue = "讲一个笑话") String input) {
        Prompt prompt = new Prompt(input);
        return deepSeekChatClient.prompt(prompt).call().content();
    }

    @GetMapping("/hello/ollama")
    public String ollama(@RequestParam(value = "input", defaultValue = "讲一个笑话") String input) {
        Prompt prompt = new Prompt(input);
        return ollamaChatClient.prompt(prompt).call().content();
    }
}
```
4. Into the Tiger's Den: Digging Through the Call Chain
Since we are already deep in the code, let's keep going and see how a call actually reaches the underlying model.
4.1 Building the ChatClient
DefaultChatClientBuilder
```java
@Bean("deepSeekChatClient")
public ChatClient deepSeekChatClient(OpenAiChatModel openAiChatModel) {
    return ChatClient.create(openAiChatModel);
}
```
Start from the construction of the `ChatClient`: it is returned by `ChatClient.create(openAiChatModel)`:
```java
static ChatClient create(ChatModel chatModel) {
    return create(chatModel, ObservationRegistry.NOOP);
}
```
The `create` method eventually calls `new DefaultChatClientBuilder()` to get a default builder, and calling the builder's `build` method returns the `ChatClient`. At this point the `ChatClient` is fully built: `ChatClient` = `DefaultChatClientBuilder.build()`.
```java
static Builder builder(ChatModel chatModel, ObservationRegistry observationRegistry,
        @Nullable ChatClientObservationConvention customObservationConvention) {
    Assert.notNull(chatModel, "chatModel cannot be null");
    Assert.notNull(observationRegistry, "observationRegistry cannot be null");
    return new DefaultChatClientBuilder(chatModel, observationRegistry, customObservationConvention);
}
```
The constructor of `DefaultChatClientBuilder` creates the default model-call request spec, `DefaultChatClientRequestSpec`. Classes ending in `xxxSpec` can be read as Spring AI's "spec" entities for a given concept; essentially they group the related attributes into one object. We will meet more spec entities like this later, hence this note up front.
```java
public DefaultChatClientBuilder(ChatModel chatModel, ObservationRegistry observationRegistry,
        @Nullable ChatClientObservationConvention customObservationConvention) {
    Assert.notNull(chatModel, "the " + ChatModel.class.getName() + " must be non-null");
    Assert.notNull(observationRegistry, "the " + ObservationRegistry.class.getName() + " must be non-null");
    this.defaultRequest = new DefaultChatClientRequestSpec(chatModel, null, Map.of(), null, Map.of(), List.of(),
            List.of(), List.of(), List.of(), null, List.of(), Map.of(), observationRegistry,
            customObservationConvention, Map.of());
}
```
DefaultChatClientRequestSpec
```java
public DefaultChatClientRequestSpec(ChatModel chatModel, @Nullable String userText,
        Map<String, Object> userParams, @Nullable String systemText, Map<String, Object> systemParams,
        List<FunctionCallback> functionCallbacks, List<Message> messages, List<String> functionNames,
        List<Media> media, @Nullable ChatOptions chatOptions, List<Advisor> advisors,
        Map<String, Object> advisorParams, ObservationRegistry observationRegistry,
        @Nullable ChatClientObservationConvention customObservationConvention,
        Map<String, Object> toolContext) {

    // some code omitted
    this.chatModel = chatModel;
    this.chatOptions = chatOptions != null ? chatOptions.copy()
            : (chatModel.getDefaultOptions() != null) ? chatModel.getDefaultOptions().copy() : null;

    this.userText = userText;
    this.userParams.putAll(userParams);
    this.systemText = systemText;
    this.systemParams.putAll(systemParams);
    this.functionNames.addAll(functionNames);
    this.functionCallbacks.addAll(functionCallbacks);
    this.messages.addAll(messages);
    this.media.addAll(media);
    this.advisors.addAll(advisors);
    this.advisorParams.putAll(advisorParams);
    this.observationRegistry = observationRegistry;
    this.customObservationConvention = customObservationConvention != null ? customObservationConvention
            : DEFAULT_CHAT_CLIENT_OBSERVATION_CONVENTION;
    this.toolContext.putAll(toolContext);

    // At the stack bottom add the non-streaming and streaming model call advisors.
    // They play the role of the last advisor in the around advisor chain.
    this.advisors.add(new CallAroundAdvisor() {

        @Override
        public String getName() {
            return CallAroundAdvisor.class.getSimpleName();
        }

        @Override
        public int getOrder() {
            return Ordered.LOWEST_PRECEDENCE;
        }

        @Override
        public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
            return new AdvisedResponse(chatModel.call(advisedRequest.toPrompt()),
                    Collections.unmodifiableMap(advisedRequest.adviseContext()));
        }

    });

    this.advisors.add(new StreamAroundAdvisor() {

        @Override
        public String getName() {
            return StreamAroundAdvisor.class.getSimpleName();
        }

        @Override
        public int getOrder() {
            return Ordered.LOWEST_PRECEDENCE;
        }

        @Override
        public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {
            return chatModel.stream(advisedRequest.toPrompt())
                .map(chatResponse -> new AdvisedResponse(chatResponse,
                        Collections.unmodifiableMap(advisedRequest.adviseContext())))
                .publishOn(Schedulers.boundedElastic()); // TODO add option to disable.
        }

    });

    this.aroundAdvisorChainBuilder = DefaultAroundAdvisorChain.builder(observationRegistry)
        .pushAll(this.advisors);
}
```
The `DefaultChatClientRequestSpec` constructor takes a pile of parameters, most of them defaults that can be overridden. Three points deserve attention here.
`chatOptions` is the optional-parameter entity for the call. Note that the assignment goes through `copy()`, which prevents later mutation of the entity from leaking back into the initial settings. This is exactly the kind of hidden hazard we create in everyday code when a parameter is passed several layers deep and gets rewritten somewhere along the way; a habit worth learning.
```java
this.chatOptions = chatOptions != null ? chatOptions.copy()
        : (chatModel.getDefaultOptions() != null) ? chatModel.getDefaultOptions().copy() : null;
```
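The defensive-copy idea is easy to demonstrate outside Spring AI. A minimal, self-contained sketch (the `Options` class below is invented for illustration and is not a Spring AI type):

```java
import java.util.ArrayList;
import java.util.List;

public class DefensiveCopyDemo {

    // A tiny stand-in for ChatOptions: mutable state plus a copy() method.
    static class Options {
        final List<String> stopSequences = new ArrayList<>();

        Options copy() {
            Options copied = new Options();
            copied.stopSequences.addAll(this.stopSequences);
            return copied;
        }
    }

    // Without copy(): the "request" aliases the defaults, so mutation leaks back.
    static int mutateAliased() {
        Options defaults = new Options();
        defaults.stopSequences.add("END");
        Options request = defaults;           // same object!
        request.stopSequences.add("STOP");
        return defaults.stopSequences.size(); // the shared defaults were silently changed
    }

    // With copy(): the request gets its own snapshot, defaults stay intact.
    static int mutateCopied() {
        Options defaults = new Options();
        defaults.stopSequences.add("END");
        Options request = defaults.copy();
        request.stopSequences.add("STOP");
        return defaults.stopSequences.size();
    }

    public static void main(String[] args) {
        System.out.println(mutateAliased()); // 2
        System.out.println(mutateCopied());  // 1
    }
}
```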
The `Advisor` mechanism: the name can be read as "advisor" in the AOP sense. Every advisor implements the interface `org.springframework.ai.chat.client.advisor.api.Advisor`. The constructor adds two advisors, a `CallAroundAdvisor` and a `StreamAroundAdvisor`: `CallAroundAdvisor` is the around-advisor that executes `ChatModel#call(Prompt)`, and `StreamAroundAdvisor` is the around-advisor for streaming requests.
- `Advisor` can be understood by analogy with AOP proxies. In `CallAroundAdvisor`, `aroundCall` returns an `AdvisedResponse` by executing `chatModel.call()`; note that this only defines the implementation, nothing is actually invoked yet. `StreamAroundAdvisor` reads the same way.
```java
this.advisors.add(new CallAroundAdvisor() {
    // some code omitted
    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        return new AdvisedResponse(chatModel.call(advisedRequest.toPrompt()),
                Collections.unmodifiableMap(advisedRequest.adviseContext()));
    }
});

this.advisors.add(new StreamAroundAdvisor() {
    // some code omitted
    @Override
    public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {
        return chatModel.stream(advisedRequest.toPrompt())
            .map(chatResponse -> new AdvisedResponse(chatResponse,
                    Collections.unmodifiableMap(advisedRequest.adviseContext())))
            .publishOn(Schedulers.boundedElastic()); // TODO add option to disable.
    }
});
```
`DefaultAroundAdvisorChain` is the default around-advisor chain. Internally it uses two deques, `Deque<CallAroundAdvisor> callAroundAdvisors` and `Deque<StreamAroundAdvisor> streamAroundAdvisors`, to store the `CallAroundAdvisor` and `StreamAroundAdvisor` advisors respectively.
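The important property of this structure is that the chain consumes the deque like a stack: each step pops exactly one advisor, and the terminal model-call advisor (registered with `Ordered.LOWEST_PRECEDENCE`, the "stack bottom" from the constructor comment) must be the last one popped. A self-contained sketch of that consumption using only `java.util.ArrayDeque` (the advisor names are illustrative, not Spring AI's):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class AdvisorChainDemo {

    // Drain the deque the way the chain does: one pop per step, head first.
    static List<String> drain(Deque<String> advisors) {
        List<String> executed = new ArrayList<>();
        while (!advisors.isEmpty()) {
            executed.add(advisors.pop()); // pop() removes from the head (top of stack)
        }
        return executed;
    }

    public static void main(String[] args) {
        Deque<String> chain = new ArrayDeque<>();
        // push() puts each element on top, so the advisor pushed last runs first.
        chain.push("modelCallAdvisor"); // terminal advisor: pushed first, popped last
        chain.push("loggingAdvisor");   // user advisor: pushed later, popped first
        System.out.println(drain(chain)); // [loggingAdvisor, modelCallAdvisor]
    }
}
```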
Recap
A `DefaultChatClientRequestSpec` is built from a set of default parameters (including the `ChatModel`); from it a `DefaultChatClientBuilder` is produced, and finally `DefaultChatClientBuilder#build()` creates the `ChatClient`.
4.2 ChatClient.prompt(prompt).call().content()
prompt()
`deepSeekChatClient.prompt(prompt)` constructs the model-request spec object `ChatClientRequestSpec`. It starts from the default request object described earlier, then overlays the `Option`s and `Instruction`s the user set in the `Prompt` on top of the defaults:
```java
public ChatClientRequestSpec prompt(Prompt prompt) {
    Assert.notNull(prompt, "prompt cannot be null");

    DefaultChatClientRequestSpec spec = new DefaultChatClientRequestSpec(this.defaultChatClientRequest);

    // Options
    if (prompt.getOptions() != null) {
        spec.options(prompt.getOptions());
    }

    // Messages
    if (prompt.getInstructions() != null) {
        spec.messages(prompt.getInstructions());
    }

    return spec;
}
```
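This copy-then-overlay pattern (clone the defaults first, then apply only the per-request overrides) can be sketched with plain maps; the option keys below are illustrative, not Spring AI types:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class OverlayDefaultsDemo {

    // Copy the defaults, then let per-request values win.
    static Map<String, Object> effectiveOptions(Map<String, Object> defaults, Map<String, Object> overrides) {
        Map<String, Object> merged = new LinkedHashMap<>(defaults); // the copy keeps defaults untouched
        merged.putAll(overrides);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> defaults = Map.of("model", "deepseek-chat", "temperature", 0.7);
        Map<String, Object> perRequest = Map.of("temperature", 0.0);
        // "model" stays from the defaults, "temperature" comes from the request.
        System.out.println(effectiveOptions(new HashMap<>(defaults), perRequest));
    }
}
```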
call()
`ChatClient#call()` then builds the call-response spec object `CallResponseSpec`, which finally brings us to the `content()` method:
```java
interface CallResponseSpec {

    @Nullable
    <T> T entity(ParameterizedTypeReference<T> type);

    @Nullable
    <T> T entity(StructuredOutputConverter<T> structuredOutputConverter);

    @Nullable
    <T> T entity(Class<T> type);

    @Nullable
    ChatResponse chatResponse();

    @Nullable
    String content();

    <T> ResponseEntity<ChatResponse, T> responseEntity(Class<T> type);

    <T> ResponseEntity<ChatResponse, T> responseEntity(ParameterizedTypeReference<T> type);

    <T> ResponseEntity<ChatResponse, T> responseEntity(StructuredOutputConverter<T> structuredOutputConverter);

}
```
content()
Internally, `content()` executes `ChatResponse chatResponse = doGetChatResponse()`:
```java
private ChatResponse doGetObservableChatResponse(DefaultChatClientRequestSpec inputRequest,
        @Nullable String formatParam) {

    ChatClientObservationContext observationContext = ChatClientObservationContext.builder()
        .withRequest(inputRequest)
        .withFormat(formatParam)
        .withStream(false)
        .build();

    var observation = ChatClientObservationDocumentation.AI_CHAT_CLIENT.observation(
            inputRequest.getCustomObservationConvention(), DEFAULT_CHAT_CLIENT_OBSERVATION_CONVENTION,
            () -> observationContext, inputRequest.getObservationRegistry());

    return observation.observe(() -> doGetChatResponse(inputRequest, formatParam, observation));
}
```
Ignoring the `Observable` logic and following only the key execution path, we land directly in the `doGetChatResponse()` method:
```java
private ChatResponse doGetChatResponse(DefaultChatClientRequestSpec inputRequestSpec,
        @Nullable String formatParam, Observation parentObservation) {

    AdvisedRequest advisedRequest = toAdvisedRequest(inputRequestSpec, formatParam);

    // Apply the around advisor chain that terminates with the last model call advisor.
    AdvisedResponse advisedResponse = inputRequestSpec.aroundAdvisorChainBuilder.build()
        .nextAroundCall(advisedRequest);

    return advisedResponse.response();
}
```
The core logic of `doGetChatResponse` is three steps: build the request, fire the chained call, return the result.
`toAdvisedRequest()` converts the outgoing `DefaultChatClientRequestSpec` into an `AdvisedRequest`, a typical entity conversion done for decoupling. Once the request is built, `nextAroundCall` is invoked:
```java
public AdvisedResponse nextAroundCall(AdvisedRequest advisedRequest) {
    if (this.callAroundAdvisors.isEmpty()) {
        throw new IllegalStateException("No AroundAdvisor available to execute");
    }

    var advisor = this.callAroundAdvisors.pop();

    var observationContext = AdvisorObservationContext.builder()
        .advisorName(advisor.getName())
        .advisorType(AdvisorObservationContext.Type.AROUND)
        .advisedRequest(advisedRequest)
        .advisorRequestContext(advisedRequest.adviseContext())
        .order(advisor.getOrder())
        .build();

    return AdvisorObservationDocumentation.AI_ADVISOR
        .observation(null, DEFAULT_OBSERVATION_CONVENTION, () -> observationContext, this.observationRegistry)
        .observe(() -> advisor.aroundCall(advisedRequest, this));
}
```
- Ignoring the `observation`-related logic, the code simplifies to:
```java
var advisor = this.callAroundAdvisors.pop();
advisor.aroundCall(advisedRequest, this);
```
It first pops an `advisor` off the top of the stack, then executes the `AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain)` method. That runs the anonymous implementation added earlier, which triggers the actual execution of `chatModel.call`:
```java
this.advisors.add(new CallAroundAdvisor() {
    // some code omitted
    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        return new AdvisedResponse(chatModel.call(advisedRequest.toPrompt()),
                Collections.unmodifiableMap(advisedRequest.adviseContext()));
    }
});
```
The Advisor mechanism
Before expanding the `chatModel.call` logic, let's look at Spring AI's Advisor mechanism.
The `aroundCall` method takes two parameters: the request object and the `CallAroundAdvisorChain`.
Question: some readers will ask at this point, there is just a single `pop()` here, and no `for` loop or other obvious chained iteration?
Answer: Spring AI's chaining is quite flexible. When `aroundCall` is invoked, the `CallAroundAdvisorChain` instance is handed over as an argument; whether to continue executing down the chain is left entirely to the user.
For example, suppose we want to log before and after `chatModel.call`. We can define a `CusLoggerAdvisor`:
```java
public class CusLoggerAdvisor implements CallAroundAdvisor {

    private AdvisedRequest before(AdvisedRequest request) {
        logger.debug("request: {}", this.requestToString.apply(request));
        return request;
    }

    private void after(AdvisedResponse advisedResponse) {
        logger.debug("response: {}", this.responseToString.apply(advisedResponse.response()));
    }

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        advisedRequest = before(advisedRequest);
        AdvisedResponse advisedResponse = chain.nextAroundCall(advisedRequest);
        after(advisedResponse);
        return advisedResponse;
    }
}
```
`CusLoggerAdvisor` is then pushed onto the top of the stack, so at execution time `CusLoggerAdvisor#aroundCall` runs first: it executes `before`, calls `chain.nextAroundCall(advisedRequest)` to run the next `advisor` in the chain, and then executes `after`. Whether `nextAroundCall` is invoked at all is entirely up to the user.
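The whole mechanism (pop one advisor, let it decide whether to call the chain again, with a terminal advisor performing the model call) fits in a few dozen lines of plain Java. A self-contained sketch; all types here are simplified stand-ins for Spring AI's, not the real API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class MiniAdvisorChainDemo {

    interface Advisor {
        String aroundCall(String request, Chain chain);
    }

    // The chain pops one advisor per call, mirroring nextAroundCall.
    static class Chain {
        final Deque<Advisor> advisors;

        Chain(Deque<Advisor> advisors) { this.advisors = advisors; }

        String nextAroundCall(String request) {
            if (advisors.isEmpty()) {
                throw new IllegalStateException("No advisor available to execute");
            }
            return advisors.pop().aroundCall(request, this);
        }
    }

    public static final List<String> EVENTS = new ArrayList<>();

    public static String run(String request) {
        EVENTS.clear();
        Deque<Advisor> stack = new ArrayDeque<>();
        // Terminal advisor: plays the role of the model call and never recurses.
        stack.push((req, chain) -> { EVENTS.add("model"); return "response-to-" + req; });
        // Logging advisor: wraps the rest of the chain, like CusLoggerAdvisor.
        stack.push((req, chain) -> {
            EVENTS.add("before");
            String resp = chain.nextAroundCall(req); // continuing is the advisor's choice
            EVENTS.add("after");
            return resp;
        });
        return new Chain(stack).nextAroundCall(request);
    }

    public static void main(String[] args) {
        System.out.println(run("hi"));   // response-to-hi
        System.out.println(EVENTS);      // [before, model, after]
    }
}
```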
chatModel.call()
Back to the `chatModel.call` logic. Here `chatModel` is the `OpenAiChatModel` object, and its `call()` is defined as follows: it first rebuilds the final prompt object from the various inputs, then issues the request via `internalCall`:
```java
@Override
public ChatResponse call(Prompt prompt) {
    // Before moving any further, build the final request Prompt,
    // merging runtime and default options.
    Prompt requestPrompt = buildRequestPrompt(prompt);
    return this.internalCall(requestPrompt, null);
}
```
internalCall
Simplifying the code a bit (a `return` is added at the end so the simplified method is complete):
```java
public ChatResponse internalCall(Prompt prompt, ChatResponse previousChatResponse) {
    ChatCompletionRequest request = createRequest(prompt, false);

    ResponseEntity<ChatCompletion> completionEntity = this.retryTemplate
        .execute(ctx -> this.openAiApi.chatCompletionEntity(request, getAdditionalHttpHeaders(prompt)));

    var chatCompletion = completionEntity.getBody();
    List<Choice> choices = chatCompletion.choices();

    List<Generation> generations = choices.stream().map(choice -> {
        Map<String, Object> metadata = Map.of(
                "id", chatCompletion.id() != null ? chatCompletion.id() : "",
                "role", choice.message().role() != null ? choice.message().role().name() : "",
                "index", choice.index(),
                "finishReason", choice.finishReason() != null ? choice.finishReason().name() : "",
                "refusal", StringUtils.hasText(choice.message().refusal()) ? choice.message().refusal() : "");
        return buildGeneration(choice, metadata, request);
    }).toList();

    RateLimit rateLimit = OpenAiResponseHeaderExtractor.extractAiResponseHeaders(completionEntity);

    // Current usage
    OpenAiApi.Usage usage = completionEntity.getBody().usage();
    Usage currentChatResponseUsage = usage != null ? getDefaultUsage(usage) : new EmptyUsage();
    Usage accumulatedUsage = UsageUtils.getCumulativeUsage(currentChatResponseUsage, previousChatResponse);

    ChatResponse chatResponse = new ChatResponse(generations,
            from(completionEntity.getBody(), rateLimit, accumulatedUsage));
    return chatResponse;
}
```
A `ChatCompletionRequest` is constructed and `openAiApi.chatCompletionEntity()` is invoked, wrapped in the `retryTemplate`. This is exactly where Spring AI's various retry policies take effect. After the call completes, the response data is parsed in various ways into a `ChatResponse` object (there is more afterwards, such as tool execution, which this article does not expand on).
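`retryTemplate.execute` retries the callback according to a configured policy. Stripped of Spring Retry's backoff and exception classification, the mechanism reduces to something like this stdlib-only sketch (a hypothetical stand-in, not Spring Retry's actual implementation):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class MiniRetryDemo {

    // Retry the call up to maxAttempts times, rethrowing the last failure.
    static <T> T executeWithRetry(int maxAttempts, Supplier<T> call) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e; // a real policy would also honor backoff and retryable-exception filters
            }
        }
        throw last;
    }

    // Simulates a flaky HTTP call that succeeds on the given attempt.
    public static Supplier<String> flaky(int succeedOnAttempt) {
        AtomicInteger attempts = new AtomicInteger();
        return () -> {
            if (attempts.incrementAndGet() < succeedOnAttempt) {
                throw new RuntimeException("transient error");
            }
            return "ok after " + attempts.get() + " attempts";
        };
    }

    public static void main(String[] args) {
        // Fails twice, succeeds on the third attempt.
        System.out.println(executeWithRetry(3, flaky(3))); // ok after 3 attempts
    }
}
```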
`this.openAiApi.chatCompletionEntity` is implemented as a network request through `restClient`. At this point, one full request cycle is complete.
```java
public ResponseEntity<ChatCompletion> chatCompletionEntity(ChatCompletionRequest chatRequest,
        MultiValueMap<String, String> additionalHttpHeader) {

    Assert.notNull(chatRequest, "The request body can not be null.");
    Assert.isTrue(!chatRequest.stream(), "Request must set the stream property to false.");
    Assert.notNull(additionalHttpHeader, "The additional HTTP headers can not be null.");

    return this.restClient.post()
        .uri(this.completionsPath)
        .headers(headers -> headers.addAll(additionalHttpHeader))
        .body(chatRequest)
        .retrieve()
        .toEntity(ChatCompletion.class);
}
```
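Underneath, this is just an HTTP POST to a chat-completions endpoint. Using only `java.net.http` we can sketch what that request looks like without Spring's `RestClient` (the URL, path, and JSON body below are illustrative, and nothing is actually sent):

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class ChatCompletionRequestDemo {

    // Build (but do not send) an OpenAI-style chat-completions POST request.
    static HttpRequest buildRequest(String baseUrl, String apiKey, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/v1/chat/completions"))
            .timeout(Duration.ofSeconds(30))
            .header("Authorization", "Bearer " + apiKey)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    public static void main(String[] args) {
        String body = "{\"model\":\"deepseek-chat\",\"messages\":[{\"role\":\"user\",\"content\":\"hi\"}]}";
        HttpRequest request = buildRequest("https://api.deepseek.com", "sk-xxx", body);
        System.out.println(request.method() + " " + request.uri());
        // POST https://api.deepseek.com/v1/chat/completions
    }
}
```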
Recap
Executing `content()` triggers the `Advisor` chain to call `OpenAiChatModel#call()`, which finally ships the request to the model via `OpenAiApi`.
Closing: Reading Code Is More Efficient with a Debugger
A quick summary of the whole flow:
- `ChatClient.Builder` depends on `ChatModel`, and the `ChatModel` is provided by each model's own `AutoConfiguration`;
- the `NoUniqueBeanDefinitionException` is solved by manually defining multiple `ChatClient`s;
- the Advisor mechanism is similar to AOP: it can insert logic before and after the call, and custom Advisors can implement logging, caching, rate limiting, and other features;
- `chatModel.call()` ultimately forwards the request to each model's API-specific implementation.
After reading the code, I hope you can:
- work out the streaming-request path on your own (the core is the Flux operations; an example appeared in Part 1);
- configure Spring AI's various parameters flexibly;
- use custom Advisors;
- appreciate Spring AI's techniques for model abstraction and decoupling, and try to imitate them by implementing an OpenAI-style starter for calling another model.