howitry comments

Results: 7 comments of howitry

The video's explanation is a bit muddled; here is what this model does, in one sentence

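Isn't the hubert+vq at inference time predicted autoregressively from the text plus the discrete SSL tokens of the reference audio? Shouldn't the hubert+vq generated at inference therefore already contain the reference timbre? Why would there be less timbre leakage?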

Some questions about flow

> we use speech tokenizer, which means we must use flow model to reconstruct the mel sequence

I understand that flow is used to transform code to mel. But the...

CER Performance of Reconstructed Audio

> > When using the 40 tokens/s configuration, although the quality of the reconstructed audio is very good, there are often some mispronunciations. Have you measured the CER performance of...

Support AcceleratorConfig.use_stateful_dataloader in Trainer

> Hey @byi8220, I'll review this PR, sorry for the wait! Are you still up to tweak/finish this PR with my guidance? If so, if you can first...

Are there any plans to optimize the fetcher_state in StatefulDataLoader?

> we keep track of number of batches yielded, so the batches that were prefetched by the workers but not yielded by the dataloader, they are fetched again and yielded...
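The behaviour described in the quote above (only the count of yielded batches is saved, so prefetched-but-unyielded batches are fetched again on resume) can be illustrated with a toy loader. This is a minimal sketch, not the StatefulDataLoader implementation: the class `ResumableLoader`, its `prefetch` parameter, and the `num_yielded` state key are hypothetical names chosen for illustration, and the prefetch buffer here is a plain in-process queue rather than worker processes.

```python
from collections import deque


class ResumableLoader:
    """Toy loader (hypothetical): prefetches ahead, checkpoints only the yielded count."""

    def __init__(self, data, batch_size, prefetch=2):
        self.data = list(data)
        self.batch_size = batch_size
        self.prefetch = prefetch
        self.num_yielded = 0  # the only piece of state that is checkpointed

    def state_dict(self):
        # Prefetched batches are deliberately NOT saved, only the counter.
        return {"num_yielded": self.num_yielded}

    def load_state_dict(self, state):
        self.num_yielded = state["num_yielded"]

    def _batch(self, index):
        start = index * self.batch_size
        return self.data[start:start + self.batch_size]

    def __iter__(self):
        num_batches = (len(self.data) + self.batch_size - 1) // self.batch_size
        next_to_fetch = self.num_yielded  # resume at the first batch not yet yielded
        buffer = deque()
        while True:
            # Fill the prefetch buffer ahead of the consumer.
            while len(buffer) < self.prefetch and next_to_fetch < num_batches:
                buffer.append(self._batch(next_to_fetch))
                next_to_fetch += 1
            if not buffer:
                return
            batch = buffer.popleft()
            self.num_yielded += 1
            yield batch


loader = ResumableLoader(range(10), batch_size=2, prefetch=3)
it = iter(loader)
first = [next(it), next(it)]   # two batches consumed, later ones already prefetched
state = loader.state_dict()    # the prefetched batches are dropped from the checkpoint

resumed = ResumableLoader(range(10), batch_size=2, prefetch=3)
resumed.load_state_dict(state)
rest = list(resumed)           # the dropped prefetched batches are fetched again here
```

In this sketch no batch is lost or repeated across the checkpoint, at the cost of redoing the fetch work for whatever sat in the prefetch buffer when the state was saved.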

Are there any plans to optimize the fetcher_state in StatefulDataLoader?

> [@ramanishsingh](https://github.com/ramanishsingh) can clarify on how we don't "lose data" and we resume on correct data.
>
> But regarding saving actual data in ckpt, that is not feasible because...

Data duplication with `split_dataset_by_node` and `interleaved_dataset`

> split_dataset_by_node

Hello, I have some questions about your intended use: (1) It seems unnecessary to use interleaving for a single dataset. (2) For multiple datasets, it seems possible to...