
DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models (IJCAI 2023) | The DiffuseStyleGesture+ entry to the GENEA Challenge 2023 (ICMI 2023, Reproducibility Awar...

Results 17 DiffuseStyleGesture issues

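I read the paper and saw that it uses the objective metric FGD to evaluate the model. I trained a new model on my own data, but I could not find how to compute FGD anywhere in the project, so I cannot evaluate my own model. Could the author provide the relevant method?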
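For readers with the same question, the Fréchet distance at the core of FGD-style metrics can be sketched as below. This is a minimal sketch, not the authors' evaluation code: it assumes the real and generated gestures have already been encoded into fixed-length feature vectors by some feature extractor (not shown here), and the function name `frechet_distance` and the random arrays are hypothetical, for illustration only.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_samples, feature_dim).
    Assumption: the features come from a gesture encoder that is not shown.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b; tiny negative or complex parts are
    # numerical noise and are clipped away.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Illustrative data only: random "features", and a copy shifted by 0.5 per dimension.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
generated = real + 0.5

print(frechet_distance(real, real))
print(frechet_distance(real, generated))
```

A lower value means the two feature distributions are closer. The score depends heavily on the feature extractor, so values are only comparable when the same extractor is used for both sets.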

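That is: the audio is played in real time in a streaming fashion while the pose for the current state is computed at the same time, similar to the real-time lip-sync computation in NVIDIA's Audio2Face.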

Hello, does this model come with pretrained data? Does it support Chinese? If I train it on my own Chinese data, will Chinese be supported?

Hi, I couldn't find this file in the GitHub repo or on Blender Market. How can I get it? Thank you in advance!

Suppose I have my own 3D model. How can I convert it to an SMPL or SMPL-X parametric model? Let's say this is my 3D model: ![1](https://github.com/YoungSeng/DiffuseStyleGesture/assets/75164118/00fbb07d-fd14-4523-9fb3-82c1bc925def) Convert to two...

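Hello, I am currently trying to convert the output .bvh file into an .mp4 skeleton video. Is there any documentation or other material that describes the position information for each joint of the skeleton?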

Hello, I used the AMASS dance motion dataset and the Jukebox audio extractor to extract audio features and fed them into model training, but the generated results were not very good....

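Regarding the dimensions of the features extracted in DiffuseStyleGesture+: why are the audio feature dimensions set to 40+64+2+2+1024+1? Why is MFCC 40, log-mel 64, the prosody features 4, and so on? Is there a particular reason for these settings, and why were these feature dimensions chosen?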
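To make the arithmetic in this question concrete, the sketch below sums the quoted dimensions and records where each block would sit in a concatenated per-frame vector. It assumes the blocks are concatenated in the order quoted. Only the MFCC, log-mel, and prosody sizes are named in the question; the keys `prosody_part_1`, `prosody_part_2`, `block_1024`, and `block_1` are placeholder names, not identifiers from the repository.

```python
# Per-frame audio feature sizes as quoted in the question: 40+64+2+2+1024+1.
feature_dims = {
    "mfcc": 40,
    "log_mel": 64,
    "prosody_part_1": 2,  # the question says prosody totals 4 (2 + 2)
    "prosody_part_2": 2,
    "block_1024": 1024,   # placeholder name: not labelled in the question
    "block_1": 1,         # placeholder name: not labelled in the question
}

total_dim = sum(feature_dims.values())

# Start/end offsets of each block inside the concatenated feature vector.
slices = {}
offset = 0
for name, dim in feature_dims.items():
    slices[name] = (offset, offset + dim)
    offset += dim

print(total_dim)
for name, (start, end) in slices.items():
    print(f"{name}: [{start}, {end})")
```

The total is 1133 values per frame, of which the 1024-dimensional block alone accounts for roughly 90%.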

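1) Was the model you released, the checkpoint at step 450000, trained only on ZEGGS? I retrained following the instructions, but the results seem worse than those of your released model. 2) Do you have any good suggestions for real-time acceleration of diffusion? (I tried PLMS and DDIM without changing anything else, and both are still slow.)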

Hello, thanks for sharing the code! In DiffuseStyleGesture, the model uses only one audio feature, WavLM. But when extracting the WavLM feature from the raw wav, the...
