Moisés Horta Valenzuela
Hello, Thanks for this great work. I'm wondering if you could provide instructions on how to perform the Upscaling task? Thanks!
Hello, This is less of an issue and more of a feature request. I've found that controlling the Prior guided generation in real time tends to produce either really good results...
Hi, I'm giving WaveGAN another go this year. Mainly, I'm wondering if it's possible to save compute time and perform transfer learning from the provided pre-trained checkpoints? I've tried directing...
Hi, I've got a script going which takes an input audio file, crops it into 30-second chunks, passes each one consecutively to the generate_with_chroma() function, and then concatenates the results. Even...
Hi! I've been successfully training a new RAVEv2 model. What I noticed is that when training a model, the learned latent dimension changes radically from training phase 1 to 2....
Hello, Thanks for open-sourcing this work, it's very valuable for understanding how denoising diffusion models behave in the audio domain. I've started a new training...
Hello, Thanks for this great repo, been having a lot of fun with it. I'm wondering if it's possible to implement the newer ``audioldm_48khz`` checkpoint for finetuning? It seems it...
Hello, Thanks for this work! I noticed that the pre-trained 44.1 kHz weights aren't doing the best job at reconstructing some music outside of the dataset scope. I'm wondering if you're...
Hello, I've been reading a lot of the SOTA papers on audio and video generation using Rectified Flows, and it seems most are using Transformers instead of U-Nets. Are there...
Hello, Thanks so much for open-sourcing the code. I have been training an unconditioned RF model on audio latents, with really good quality results. Here are some audio examples: https://drive.google.com/file/d/169NMzxl0k5X8oqiadNs3e7sjlxz8V5Pk/view?usp=sharing...