ahaliassos
Hi, any news on when these two transforms will be added to albumentations?
Hi, the provided checkpoint was trained on FF++ with c23 (HQ) compression, so the HQ results should match. I'm not sure why they currently do not. Do the final mouth...
Hi, indeed, the code assumes that each video contains one face that needs to be extracted. For example, in FF++, only the largest face is tracked and extracted (see Appendix...
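To illustrate what "only the largest face" could mean in practice, here is a minimal sketch that picks the detection with the largest bounding-box area in a frame. It is not the repository's preprocessing code: the function name `largest_face` and the `(x1, y1, x2, y2)` box format are assumptions made for this example.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes count as zero."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def largest_face(boxes: Sequence[Box]) -> Optional[Box]:
    """Return the box with the largest area, or None if nothing was detected."""
    if not boxes:
        return None
    return max(boxes, key=box_area)


# Two detections in one frame: the second covers more pixels, so it is kept.
detections = [(0, 0, 10, 10), (5, 5, 50, 60)]
selected = largest_face(detections)
```

For videos with several people, a selection like this would have to be replaced by per-identity tracking, which the sketch does not attempt.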
Does `{working directory}/outputs/YYYY-MM-DD/HH-MM-SS/stage2/eval.py` exist? And are you sure you're not running eval.py from that directory rather than running `{working directory}/stage2/eval.py`?
Hi, thanks for spotting this! You are right, the parameters were wrong. In general, the width of the decoder should match that of the encoder, except in the low-resource setting,...
Fixed, thank you.
The high-resource base LRS3+Vox2 weights file was corrupted, but I have now uploaded it again. I also fixed the config for the visual backbone (which was accidentally not pushed to...
Hi, you can have a look at https://github.com/ahaliassos/usr/blob/main/utils/labels/unigram1000_units.txt for the vocabulary and at https://github.com/ahaliassos/usr/blob/main/utils/utils.py#L6 for how the final token list is obtained (i.e., with the blank and eos tokens). Hope...
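As a rough illustration of the idea (not a copy of `utils/utils.py`), a token list with blank and eos tokens could be assembled like this. The token strings `<blank>` and `<eos>`, their positions (blank first, eos last), the function name `build_token_list`, and the assumption that each line of the units file starts with the token followed by whitespace are all guesses for this sketch; check the linked file for the actual behaviour.

```python
from typing import Iterable, List


def build_token_list(
    unit_lines: Iterable[str],
    blank: str = "<blank>",  # assumed blank symbol, placed at index 0
    eos: str = "<eos>",      # assumed end-of-sequence symbol, placed last
) -> List[str]:
    """Build a token list from lines of a units file.

    Assumes each non-empty line starts with the token, optionally followed
    by whitespace and other fields (e.g. an index), which are ignored.
    """
    units = [line.split()[0] for line in unit_lines if line.strip()]
    return [blank] + units + [eos]


# Made-up lines standing in for the contents of a units file.
example_lines = ["<unk> 1", "'s 2", "▁the 3"]
tokens = build_token_list(example_lines)
```

With a real units file, the lines would come from something like `open(path, encoding="utf-8").read().splitlines()`.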