Anshu Kumar

Results 2 issues of Anshu Kumar

@mx-mark Is it possible to fine-tune the ViViT model on my own video dataset with a different set of classes? Also, what is the procedure for creating a new dataset?

Hello, I found a difference between the audio generated by the provided demo notebook using LibriSpeech and the audio samples available on the web page. The generated audio lacks naturalness compared to...