问题:
Exception in thread "rtsp-consumer-3" org.springframework.data.redis.RedisConnectionFailureException: Unable to connect to Redis; nested exception is io.lettuce.core.RedisConnectionException: Unable to connect to localhost:6379at org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory$ExceptionTranslatingConnectionProvider.translateException(LettuceConnectionFactory.java:1689)at org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory$ExceptionTranslatingConnectionProvider.getConnection(LettuceConnectionFactory.java:1597)at org.springframework.data.redis.connection.lettuce.LettuceConnection.doGetAsyncDedicatedConnection(LettuceConnection.java:1006)at org.springframework.data.redis.connection.lettuce.LettuceConnection.getOrCreateDedicatedConnection(LettuceConnection.java:1069)at org.springframework.data.redis.connection.lettuce.LettuceConnection.getAsyncDedicatedConnection(LettuceConnection.java:990)at org.springframework.data.redis.connection.lettuce.LettuceStreamCommands.getAsyncDedicatedConnection(LettuceStreamCommands.java:395)at org.springframework.data.redis.connection.lettuce.LettuceStreamCommands.xReadGroup(LettuceStreamCommands.java:346)at org.springframework.data.redis.connection.DefaultedRedisConnection.xReadGroup(DefaultedRedisConnection.java:592)at org.springframework.data.redis.core.DefaultStreamOperations$4.inRedis(DefaultStreamOperations.java:310)at org.springframework.data.redis.core.DefaultStreamOperations$RecordDeserializingRedisCallback.doInRedis(DefaultStreamOperations.java:387)at org.springframework.data.redis.core.DefaultStreamOperations$RecordDeserializingRedisCallback.doInRedis(DefaultStreamOperations.java:382)at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:222)at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:189)at org.springframework.data.redis.core.AbstractOperations.execute(AbstractOperations.java:96)at org.springframework.data.redis.core.DefaultStreamOperations.read(DefaultStreamOperations.java:305)at com.ruoyi.vedioFrame.utils.RedisStreamOperations.readGroup(RedisStreamOperations.java:70)at com.ruoyi.vedioFrame.service.impl.StreamConsumerService.consumeFrames(StreamConsumerService.java:200)at com.ruoyi.vedioFrame.service.impl.StreamConsumerService.lambda$null$0(StreamConsumerService.java:108)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)at java.lang.Thread.run(Thread.java:745)
Caused by: io.lettuce.core.RedisConnectionException: Unable to connect to localhost:6379at io.lettuce.core.RedisConnectionException.create(RedisConnectionException.java:78)at io.lettuce.core.RedisConnectionException.create(RedisConnectionException.java:56)at io.lettuce.core.AbstractRedisClient.getConnection(AbstractRedisClient.java:330)at io.lettuce.core.RedisClient.connect(RedisClient.java:216)at org.springframework.data.redis.connection.lettuce.StandaloneConnectionProvider.lambda$getConnection$1(StandaloneConnectionProvider.java:115)at java.util.Optional.orElseGet(Optional.java:267)at org.springframework.data.redis.connection.lettuce.StandaloneConnectionProvider.getConnection(StandaloneConnectionProvider.java:115)at org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory$ExceptionTranslatingConnectionProvider.getConnection(LettuceConnectionFactory.java:1595)... 19 more
Caused by: io.netty.channel.AbstractChannel$AnnotatedSocketException: Address already in use: no further information: localhost/127.0.0.1:6379
Caused by: java.net.BindException: Address already in use: no further informationat sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)at java.lang.Thread.run(Thread.java:745)
解决
根据代码和现象分析,根本原因可能是处理时间过长导致Redis消息积压触发超时异常。以下是详细分析和解决方案:
🔍 问题根源
-
同步阻塞模型:
frameProcessor.process()
调用外部接口时存在网络延迟,导致消息处理时间超过Redis的BLOCK
超时(2000ms)。- 未及时ACK会导致消息重新进入Pending状态,积累后触发
READONLY
错误或消费者组失效。
-
Thread.sleep(1000)加剧阻塞:
- 即使没有消息,每次循环强制休眠1秒,降低消费吞吐量,无法及时处理积压消息。
-
异常未妥善处理:
- 注释掉的try-catch导致处理过程中的异常(如网络超时)未被捕获,中断循环后Redis连接未正确恢复。
🛠️ 解决方案
1. 异步处理 + 提前ACK(关键步骤)
将消息处理与ACK分离,处理前立即ACK避免消息积压,异步处理业务逻辑:
ExecutorService executor = Executors.newCachedThreadPool(); // 全局线程池while (!Thread.currentThread().isInterrupted()) {List<MapRecord<String, Object, Object>> records = redisStreamOperations.readGroup(...);if (records != null) {for (MapRecord<String, Object, Object> record : records) {// 提前ACK释放消息redisStreamOperations.ack(streamKey, groupName, record.getId());// 提交异步任务处理业务executor.submit(() -> {try {String strdata = (String) record.getValue().get("frameData");byte[] frameData = Base64.getDecoder().decode(strdata);BufferedImage image = ImageIO.read(new ByteArrayInputStream(frameData));frameProcessor.process(image, streamId, commonQueues, regions);} catch (Exception e) {log.error("处理帧失败: {}", e.getMessage());}});}}// 移除Thread.sleep(1000) 提升吞吐量
}
2. 调整Redis消费者组参数
增加XGROUP CREATE
时的BLOCK
和IDLE
超时:
XGROUP CREATE stream:1234 mygroup $ MKSTREAM BLOCK 5000 IDLE 30000
3. 优化Redis连接配置
确保RedisTemplate的读写超时大于业务处理最长时间:
@Bean
public RedisTemplate<String, Object> redisTemplate() {RedisTemplate<String, Object> template = new RedisTemplate<>();template.setConnectionFactory(redisConnectionFactory());// 设置超时时间(示例:10秒)RedisConnectionFactory factory = template.getConnectionFactory();if (factory instanceof LettuceConnectionFactory) {((LettuceConnectionFactory) factory).setTimeout(Duration.ofSeconds(10));}return template;
}
4. 监控Pending队列
定期检查Pending消息数量,异常时重置消费者:
// 在循环中添加检查
Long pending = redisTemplate.opsForStream().pending(streamKey, groupName);
if (pending > 1000) { // 阈值根据业务设定redisTemplate.opsForStream().destroyGroup(streamKey, groupName);redisTemplate.opsForStream().createGroup(streamKey, groupName);
}
5. 添加熔断机制
当外部接口连续失败时,暂停消费避免雪崩:
// 全局熔断器
CircuitBreaker breaker = CircuitBreaker.ofDefaults("frameProcessor");// 处理逻辑改为
breaker.executeSupplier(() -> {frameProcessor.process(...);return null;
});
📈 验证方案
-
日志监控:
// 在异步任务中添加耗时日志 long start = System.currentTimeMillis(); frameProcessor.process(...); log.info("处理耗时: {}ms", System.currentTimeMillis() - start);
确保单次处理时间低于Redis超时阈值。
-
Redis CLI检查:
XINFO GROUPS stream:1234 # 查看消费者组状态 XPENDING stream:1234 mygroup # 查看Pending消息
🌟 关键优化点总结
措施 | 作用 |
---|---|
异步处理 + 提前ACK | 彻底解耦处理耗时与Redis消费 |
增大Redis超时参数 | 避免服务端主动断开连接 |
线程池隔离 | 防止外部接口故障拖垮整个消费服务 |
熔断机制 | 快速失败,保护系统稳定性 |
通过以上改造,即使frameProcessor.process()
调用外部平台接口,也能通过异步化和资源隔离保证Redis消费链路的稳定性。最终只需确保线程池和熔断参数适配业务负载即可。
代码具体添加位置:
以下是针对 优化Redis连接配置、监控Pending队列 和 添加熔断机制 的代码具体添加位置的详细说明:
一、优化Redis连接配置
位置:Redis配置类
在Spring Boot的Redis配置类中调整超时参数(通常为RedisConfig.java
):
@Configuration
public class RedisConfig {@Beanpublic RedisTemplate<String, Object> redisTemplate(LettuceConnectionFactory lettuceConnectionFactory) {// 设置连接超时和读写超时(关键参数)lettuceConnectionFactory.setTimeout(Duration.ofSeconds(10)); // 命令超时时间lettuceConnectionFactory.setShareNativeConnection(false); // 禁用共享连接,避免阻塞RedisTemplate<String, Object> template = new RedisTemplate<>();template.setConnectionFactory(lettuceConnectionFactory);template.setKeySerializer(new StringRedisSerializer());template.setValueSerializer(new GenericJackson2JsonRedisSerializer());return template;}
}
关键参数说明:
setTimeout(10秒)
:确保超时时间大于frameProcessor.process()
的最长处理时间。setShareNativeConnection(false)
:避免多个线程共享同一个连接导致阻塞。
二、监控Pending队列
位置:consumeFrames
方法内的循环中
在消费消息的主循环中定期检查Pending队列:
private void consumeFrames(String streamId, String groupName, String consumerName,CommonQueues commonQueues, String regions) throws InterruptedException, IOException {// ... 其他初始化代码 ...int checkPendingInterval = 10; // 每处理10次循环检查一次Pending队列int loopCount = 0;while (!Thread.currentThread().isInterrupted()) {// ... 原有代码读取消息 ...// 监控Pending队列的逻辑(添加位置)loopCount++;if (loopCount % checkPendingInterval == 0) {String streamKey = "stream:" + streamId;PendingMessages pending = redisStreamOperations.pending(streamKey, groupName);if (pending != null && pending.getTotalPendingMessages() > 1000) { // 阈值根据业务调整log.warn("检测到Pending消息积压 {} 条,重置消费者组", pending.getTotalPendingMessages());redisStreamOperations.destroyGroup(streamKey, groupName);redisStreamOperations.createGroup(StreamKey.of(streamKey), groupName);}}// ... 后续处理代码 ...}
}
说明:
- 通过
redisStreamOperations.pending()
获取当前Pending消息数。 - 当Pending消息超过阈值时,强制销毁并重建消费者组,避免消息卡死。
三、添加熔断机制
位置:处理消息的业务逻辑外层
使用Resilience4j熔断器包裹frameProcessor.process()
调用:
1. 熔断器配置类
@Configuration
public class CircuitBreakerConfig {@Beanpublic CircuitBreaker frameProcessorCircuitBreaker() {CircuitBreakerConfig config = CircuitBreakerConfig.custom().failureRateThreshold(50) // 失败率阈值50%.slidingWindowType(SlidingWindowType.COUNT_BASED).slidingWindowSize(10) // 基于最近10次调用统计.minimumNumberOfCalls(5) // 最少5次调用后开始计算.waitDurationInOpenState(Duration.ofSeconds(30)) // 熔断后30秒进入半开状态.build();return CircuitBreakerRegistry.of(config).circuitBreaker("frameProcessor");}
}
2. 在消费代码中使用熔断器
public class YourConsumerClass {@Autowiredprivate CircuitBreaker frameProcessorCircuitBreaker; // 注入熔断器private void consumeFrames(...) {// ... 原有代码 ...for (MapRecord<String, Object, Object> record : records) {redisStreamOperations.ack(...); // 提前ACK// 使用熔断器保护处理逻辑(添加位置)Try.runRunnable(() -> frameProcessorCircuitBreaker.executeRunnable(() -> {String strdata = (String) record.getValue().get("frameData");byte[] frameData = Base64.getDecoder().decode(strdata);BufferedImage image = ImageIO.read(new ByteArrayInputStream(frameData));frameProcessor.process(image, streamId, commonQueues, regions);})).onFailure(e -> log.error("处理失败且熔断: {}", e.getMessage()));}// ... 后续代码 ...}
}
熔断逻辑说明:
- 当
frameProcessor.process()
连续失败触发阈值时,熔断器会暂时阻止后续调用,避免雪崩效应。 - 熔断期间直接跳过处理,但仍会ACK消息(根据业务需求选择是否重试)。
四、代码集成位置总结
优化措施 | 代码位置 | 关键注解 |
---|---|---|
Redis连接配置 | Redis配置类(如RedisConfig.java ) | 调整超时时间和连接池参数 |
Pending队列监控 | consumeFrames 方法的主循环内 | 定期检查+自动重置消费者组 |
熔断机制 | 业务处理代码外层(包裹frameProcessor.process ) | 依赖熔断器库(如Resilience4j) |
五、参数调整建议
-
Redis超时:
lettuceConnectionFactory.setTimeout
应大于frameProcessor.process()
的最大处理时间 + 网络抖动余量(如设置为实际最大处理时间的2倍)。
-
Pending队列阈值:
- 如果每秒处理100条消息,阈值可设置为
1000
(相当于10秒积压量)。
- 如果每秒处理100条消息,阈值可设置为
-
熔断器参数:
failureRateThreshold
:根据外部接口的稳定性调整(如频繁超时可设为70%)。waitDurationInOpenState
:根据外部服务恢复时间调整(如30秒到5分钟)。
通过以上改造,即使frameProcessor.process()
调用外部平台接口,也能通过资源隔离、快速失败和自动恢复机制保障Redis消费链路的稳定性。