zzchust
```
def preprocess(
    sources,
    tokenizer: transformers.PreTrainedTokenizer,
    max_len: int,
    system_message: str = "You are a helpful assistant."
) -> Dict:
    roles = {"user": "user", "assistant": "assistant"}
    im_start = tokenizer.im_start_id
    im_end =...
```
Generating audio content from video is very useful. Have you considered adding this capability in the near future? Here is an amazing work from open-mmlab. FoleyCrafter: Bring silent videos...
Is there any demo code for compressing audio and decompressing it with the codec? Many thanks.
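I tested one ucg audio clip and reconstructed it with the 75token-large model, but the result is very poor. Could you help take a look? [Test audio: original audio + reconstructed audio.zip](https://github.com/user-attachments/files/18093320/-.%2B.zip)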