diffwave-sr issues

about Loss_T

1

![image](https://github.com/yoyololicon/diffwave-sr/assets/32425101/9197dd7c-0d73-4103-8c11-0697e8af8532) Hello, I'm trying to train a model on the OpenSinger dataset, but Loss_T keeps going up and the resulting speech is almost incomprehensible. Do you have any suggestions...

FlyToYourMooN

16 kHz checkpoint

6

Excuse me, what bandwidth range was your 16 kHz model trained on, and can it extend waveforms from 2 kHz, 4 kHz, and 8 kHz to 16 kHz?

yxlu-0102

run lfilter on GPU

2

depends on yoyololicon/audio#12

yoyolicoris

enhancement

Inquiry about reconstruction loss

1

Hello. I have a question about the formula below. How did you derive it? https://github.com/yoyololicon/diffwave-sr/blob/cab5c4e330c8b6d8b329a6c85812a7328fe3431c/loss.py#L20 This research uses audio data. Is the data treated as continuous? I would appreciate your help.

ken-take-it-so-so

good first issue

About LSD metric

9

Hi, I am currently working on evaluation metrics for audio super-resolution, and I am wondering whether the LSD metric leads to sub-optimal results. For example, the following STFT image consists...

QA-MDT
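For readers unfamiliar with the metric discussed in this issue, here is a minimal sketch of log-spectral distance (LSD) as it is commonly defined in audio super-resolution work: the root-mean-square difference of log power spectra over frequency, averaged over frames. It is not taken from this repository's evaluation code. The function names, the naive DFT, and the `n_fft`, `hop`, and `eps` values are all illustrative assumptions, kept to the standard library so the example is self-contained.

```python
import cmath
import math


def stft_power(signal, n_fft=64, hop=16):
    """Power spectrogram (frames x bins) using a Hann window and a naive DFT.

    Hypothetical helper for illustration; a real evaluation would use an FFT.
    """
    if len(signal) < n_fft:
        raise ValueError("signal must contain at least n_fft samples")
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(n_fft)]
        bins = []
        for k in range(n_fft // 2 + 1):  # keep non-negative frequencies only
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n in range(n_fft)
            )
            bins.append(abs(acc) ** 2)
        frames.append(bins)
    return frames


def lsd(reference, estimate, n_fft=64, hop=16, eps=1e-10):
    """Log-spectral distance: per-frame RMS of log10 power differences,
    averaged over frames. eps (an assumed value) avoids log10(0)."""
    ref = stft_power(reference, n_fft, hop)
    est = stft_power(estimate, n_fft, hop)
    per_frame = []
    for r, e in zip(ref, est):
        sq = [(math.log10(a + eps) - math.log10(b + eps)) ** 2 for a, b in zip(r, e)]
        per_frame.append(math.sqrt(sum(sq) / len(sq)))
    return sum(per_frame) / len(per_frame)


# Toy signals at an assumed 1 kHz sample rate: the estimate lacks the 300 Hz component.
sr = 1000
reference = [
    math.sin(2 * math.pi * 50 * t / sr) + 0.5 * math.sin(2 * math.pi * 300 * t / sr)
    for t in range(256)
]
estimate = [math.sin(2 * math.pi * 50 * t / sr) for t in range(256)]

print(lsd(reference, reference))  # identical signals give zero distance
print(lsd(reference, estimate))   # missing high band gives a positive distance
```

Because the distance is computed on log power, bins with very little energy can dominate the average. This is one reason the choice of `eps` and of STFT parameters can change the reported number noticeably.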
