Unreal Engine | Real-time speech-to-lipsync (Multilingual lipsync)

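Real-time speech-to-lipsync: the MetaHuman SDK plugin (metahumansdk.io) and NVIDIA's Audio2Face both work well. This article uses the MetaHuman SDK.

What you need: a MetaHuman SDK web account, which gives you two days of free tokens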

https://space.metahumansdk.io/#/unauthorized

-------------------- Main text --------------------

Runtime text-to-speech node!!!

https://docs.metahumansdk.io/metahuman-sdk/reference/metahumansdk-unreal-engine-plugin/text-to-speech

Runtime speech-to-lipsync node!!!

Reducing latency

Short delay runtime generation

When using lip sync generation in runtime, you may encounter the problem with response time. The thing is that its duration directly depends on the length of the incoming audio file.

To reduce the playback delay, we suggest use short-delay implementation of audio-to-lipsync method. This method allows you to get the animation in chunks as they are generated, without waiting until the file is fully processed.

During the request execution a buffer will be created where the animation chunks are added as the data is received. When the buffer is full enough to start playing, the OnFilled event is triggered. After that you can start playing the resulting animation sequentially, selecting the necessary chunks from the buffer.

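This passage from the official site discusses how to reduce playback delay when generating lip sync at runtime. Generation time depends directly on the length of the audio file: the longer the audio, the longer it takes to generate the lip animation, and the longer the wait before playback can start.

To cut this delay, the docs recommend the "short-delay" audio-to-lipsync method. Instead of waiting for the whole audio file to be processed, the animation is delivered in small chunks as they are generated, so playback can begin before processing has finished.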
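To make the difference concrete, here is a rough back-of-the-envelope model in Python. The processing ratio, chunk length, and the function names are all hypothetical values chosen for illustration; they are not measurements of the MetaHuman SDK service.

```python
def full_file_delay(audio_seconds: float, processing_ratio: float) -> float:
    """Wait before playback when the whole file must be processed first."""
    return audio_seconds * processing_ratio


def chunked_delay(chunk_seconds: float, processing_ratio: float,
                  chunks_before_play: int = 1) -> float:
    """Wait before playback when it starts after the first few chunks."""
    return chunk_seconds * processing_ratio * chunks_before_play


if __name__ == "__main__":
    ratio = 0.5  # assumption: 0.5 s of processing per 1 s of audio
    for length in (5.0, 20.0, 60.0):
        # The full-file wait grows with audio length; the chunked wait does not.
        print(length, full_file_delay(length, ratio), chunked_delay(2.0, ratio, 2))
```

Under this simplified model, the wait in the chunked case depends only on the chunk size and on how many chunks are buffered before playback, not on the total audio length.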

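Concretely, a buffer is created while the request runs, and the generated animation chunks are appended to it as they arrive. Once the buffer holds enough data to start playback, the "OnFilled" event fires. From that point you can play the animation by taking chunks from the buffer in order, which reduces the overall playback delay.

"OnFilled" is therefore a signal (a callback event) that fires when the buffer has reached a sufficient fill level. It means there is now enough animation data to begin playing, not that the whole file has been processed.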
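The buffer-and-callback pattern described above can be sketched in plain Python. This is only an illustration of the idea, not the plugin's API: the `ChunkBuffer` class, the `min_chunks_to_play` threshold, and the chunk names are hypothetical, and the real plugin exposes this through its own nodes.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class ChunkBuffer:
    """Collects animation chunks and fires on_filled once enough have arrived."""

    def __init__(self, min_chunks_to_play: int,
                 on_filled: Callable[[], None]) -> None:
        self._chunks: Deque[str] = deque()
        self._min_chunks = min_chunks_to_play  # hypothetical threshold
        self._on_filled = on_filled
        self._filled_fired = False

    def add_chunk(self, chunk: str) -> None:
        """Called each time another piece of animation is received."""
        self._chunks.append(chunk)
        if not self._filled_fired and len(self._chunks) >= self._min_chunks:
            self._filled_fired = True  # fire only once per request
            self._on_filled()

    def next_chunk(self) -> Optional[str]:
        """Returns the next chunk in arrival order, or None if none is ready."""
        return self._chunks.popleft() if self._chunks else None


def demo() -> List[str]:
    events: List[str] = []
    buffer = ChunkBuffer(min_chunks_to_play=2,
                         on_filled=lambda: events.append("filled"))
    for name in ("chunk0", "chunk1", "chunk2"):
        buffer.add_chunk(name)
        events.append(f"received {name}")
    # Playback drains the buffer sequentially.
    while (chunk := buffer.next_chunk()) is not None:
        events.append(f"play {chunk}")
    return events
```

In this sketch playback could begin as soon as "filled" is recorded, that is, after the second chunk and before the third has arrived.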

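Voice quality
(1) Google engine

⑥ cmn-CN-Wavenet-C: robotic male voice (this one seems the best)

(2) Azure engine (sounds better than Google's)
Mandarin:

② zh-CN-XiaohanNeural: female voice (sounds nicer than ①)

③ zh-CN-XiaohuangNeural: young girl's voice (possibly a typo for zh-CN-XiaoshuangNeural, Azure's child voice)

④ zh-CN-XiaoxiaoNeural: a pleasant young woman's voice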
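If you call Azure Speech directly rather than through the MetaHuman SDK node, the voice is normally selected by name inside an SSML document. The sketch below only builds that SSML string; the helper name `build_ssml` is hypothetical, and whether a given plugin accepts raw SSML is an assumption you should check against its documentation.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice: str, lang: str = "zh-CN") -> str:
    """Wraps text in a minimal SSML document that selects a named voice."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    # Voice name taken from the list above.
    print(build_ssml("你好，欢迎使用。", "zh-CN-XiaoxiaoNeural"))
```

Escaping the text matters because replies from a chat model can contain characters such as `<` or `&`, which would otherwise produce invalid XML.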
————————————————

Source: "UE5中插件MetaHumanSDK的使用" (Using the MetaHumanSDK plugin in UE5), CSDN blog

Summary!!!

Real-time text chat: ChatGPT / ERNIE Bot (文心一言) via Qianfan (OpenAI API), VaRest

Text-to-speech: iFlytek / ElevenLabs / MetaHuman SDK / AzSpeech-Voice and Text (5.4)

Speech-to-lipsync: MetaHuman SDK / Audio2Face
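The three stages chain together as chat, then text-to-speech, then lipsync. A minimal sketch of that flow, assuming each stage is a simple function, follows. All names here (`run_pipeline`, `PipelineResult`, and the stand-in stage functions) are hypothetical placeholders; in a real project each stage would be a call into one of the services or plugins listed above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    reply_text: str
    audio: bytes
    animation: str


def run_pipeline(user_text: str,
                 chat: Callable[[str], str],
                 tts: Callable[[str], bytes],
                 lipsync: Callable[[bytes], str]) -> PipelineResult:
    """Runs chat -> text-to-speech -> speech-to-lipsync in order."""
    reply = chat(user_text)       # e.g. a chat model reached over HTTP
    audio = tts(reply)            # e.g. a text-to-speech service
    animation = lipsync(audio)    # e.g. an audio-to-lipsync service
    return PipelineResult(reply, audio, animation)


# Stand-in stages so the sketch runs without any network access.
def fake_chat(text: str) -> str:
    return f"echo: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def fake_lipsync(audio: bytes) -> str:
    return f"animation for {len(audio)} bytes of audio"


if __name__ == "__main__":
    print(run_pipeline("hello", fake_chat, fake_tts, fake_lipsync))
```

Keeping each stage behind a plain function makes it easy to swap one provider for another, for example replacing the text-to-speech stage without touching the chat or lipsync stages.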

PLUGIN

varest
switchlanguage
openai api
runtime speech recognizer
runtime speech importer
metahuman sdk
metahuman
AzSpeech-Voice and Text (5.4)

A plugin that integrates Azure Speech Cognitive Services into Unreal Engine with simple functions which can do these asynchronous tasks:

