housebaby issues

Results: 9 issues of housebaby

History state info lost in nnet computer in cuda-decoder, which causes accuracy to decrease

In cuda-decoder, it seems that if an RNN model is used, the NnetComputer is re-initialized for each chunk. That's OK for non-recurrent networks. But for recurrent networks, it...

discussion

Decoder error when using multiple GPUs and multiple threads, with cuda-10.2

We derived a multi-threaded, multi-GPU version of the decoder, which can decode many utterances in parallel across multiple GPUs, for example 40 threads spread evenly over 4 GPUs. In the main thread, we init...

kaldi10-TODO

stale

waiting-for-feedback

Is it easy to implement other models, e.g. Conformer?

Conformer is an upgraded version of the Transformer, which inserts a convolution layer between each attention layer. Is it easy to adapt to other models?

bad performance for streaming transformer using trigger

Hello, I trained a streaming transformer with the following config. It seems that the loss is OK, but the decoding performance is bad. Is it necessary to use prefix-decoder? When...

no speedup using ort

I have tried using ort when training a transformer, but it seems that there is no speedup. I wonder whether I have missed something in the configuration.

how to use peoples speech dataset?

I have downloaded the People's Speech dataset and have two questions: ![image](https://user-images.githubusercontent.com/9246556/146738776-4eccf7c7-11bd-48fc-82f3-c2861dcd9a59.png) 1. How to parse the two files? ![image](https://user-images.githubusercontent.com/9246556/146738384-1ee10efc-2277-4d67-b3d2-3d72abdc52c1.png) 2. I notice there are two options: clean / other...

After updating torch to 2.3.0+cu121, torchaudio fails in func tar_file_and_group of wenet/dataset/datapipes.py

![image](https://github.com/wenet-e2e/wenet/assets/9246556/e4b459b9-9b6f-4d4a-8534-a6f583f86d4f) ![image](https://github.com/wenet-e2e/wenet/assets/9246556/038adf60-c1a5-4d56-8960-93e251510f1a) Dataset loading failed before training started. But when I put the loading script in a single file like test.py, the tar file can be...

why is it so slow to prepare train.json

To check that the fine-tuning process is OK, I used 300,000 sentences for training. I found that it is very slow to prepare train.json. After 8 hours, the train.json of...

question

Killed when using a long list of training data. How to solve it?

### System Info What does split mean here, as there is no difference between train and the others? self.data_list = [] if split == "train": with open(dataset_config.train_data_path, encoding='utf-8') as fin: for line in fin:...