housebaby
housebaby
In cuda-decoder, it seems that if a rnn model is used, the NnetComputer will re-initialized for each chunk. That's ok for non-recurrent network . But for recurrent network , it...
We derived a multi-thread multi-GPU version decoder, which can parallelly decode many utterances on multi-GPUs. for example, 40 threads evenly on 4 GPU . In the main thread, we init...
conformer is an upgraded version from transformer , which insert a convolution layer between each attention layer Is it easier to adapt to other models ?
Hello, I trained a streaming transformer with following config, it seams that the loss is OK but the decoding performance is bad. Is it neccesary to use prefix-decoder ? When...
I have tried using ort in training transformer . But it seems that no speed up is got. I wonder whether i have missed someting in configuration.
I have downloaded the people speech dataset, and have two questions:  1. How to parse the two files?  2. I notice there are two options: clean / other...
  During the dataset loading before training , it failed. But when I put the loading script in a single file like test.py , the tar file can be...
To check that the finetune process is ok , i use 30w sentences for training I found it is very slow to prepare train.json. 8 hours passed, the train.json of...
### System Info what does split mean hear,as no difference between train or others? self.data_list = [] if split == "train": with open(dataset_config.train_data_path, encoding='utf-8') as fin: for line in fin:...