xiongjun19
In fact, only 1000 samples from the validation dataset are needed, according to the guide from NVIDIA, so you just need to select 1000 images from the original validation fold, and...
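For illustration, a minimal sketch of sampling 1000 images from a validation folder follows; the directory names, glob pattern, and seed are assumptions for the example, not taken from the NVIDIA guide.

```python
# Minimal sketch: copy a random 1000-image calibration subset out of a
# validation folder. Paths, extension, and seed are hypothetical.
import random
import shutil
from pathlib import Path

src = Path("val")          # hypothetical: the original validation fold
dst = Path("calib_1000")   # hypothetical: the calibration subset
dst.mkdir(exist_ok=True)

images = sorted(src.glob("*.jpg"))
random.seed(0)                          # fixed seed so the subset is reproducible
for p in random.sample(images, 1000):
    shutil.copy(p, dst / p.name)
```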
> > Is there any fast ctc decoding interface,
>
> Please see its usage in https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/conformer_ctc/decode.py, which also supports CTC decoding in batches.
>
> > which device does...
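As a point of reference, a minimal batched greedy CTC decoder could look like the sketch below; the linked decode.py does lattice-based decoding with k2 and supports more methods, so this only illustrates the basic collapse-repeats-then-drop-blanks rule.

```python
# Sketch of batched greedy CTC decoding; not icefall's implementation.
import torch

def ctc_greedy_decode(log_probs: torch.Tensor, blank: int = 0) -> list:
    """log_probs: (batch, time, vocab) output of a CTC model."""
    ids = log_probs.argmax(dim=-1)          # best label per frame, (batch, time)
    hyps = []
    for seq in ids.tolist():
        out, prev = [], blank
        for tok in seq:
            # standard CTC rule: merge repeated labels, then remove blanks
            if tok != prev and tok != blank:
                out.append(tok)
            prev = tok
        hyps.append(out)
    return hyps
```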
> Not yet, distributed inference is indeed an interesting topic that is not supported in TVM yet.

OK, thanks.
> Please have a look at https://github.com/k2-fsa/icefall
>
> You can find tensorboard training logs in https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/RESULTS.md
>
> > 1. how big is the transducer loss for a...
Dear csukuangfj! I have studied the code in https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/transducer_stateless/beam_search.py#L363 carefully, modified it according to my trained model and code structure, and compared it with the decoding method from speechbrain, I...
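For orientation, the sketch below shows the general shape of transducer greedy search that the linked beam_search.py builds on; `decoder` and `joiner` here are stand-in callables, not icefall's exact interfaces.

```python
# Schematic transducer greedy search over one utterance; an illustration
# only, with simplified decoder/joiner interfaces.
import torch

@torch.no_grad()
def greedy_search(decoder, joiner, encoder_out: torch.Tensor,
                  blank_id: int = 0, max_sym_per_frame: int = 3) -> list:
    """encoder_out: (T, encoder_dim) for one utterance; returns token ids."""
    hyp: list = []
    dec_out = decoder(hyp)                  # prediction-net output for empty history
    for t in range(encoder_out.size(0)):
        emitted = 0
        while emitted < max_sym_per_frame:  # cap symbols emitted per frame
            logits = joiner(encoder_out[t], dec_out)
            y = int(logits.argmax(dim=-1))
            if y == blank_id:               # blank: advance to the next frame
                break
            hyp.append(y)
            dec_out = decoder(hyp)          # refresh prediction-net output
            emitted += 1
    return hyp
```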
> If you try the k2 pruned rnnt loss, https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless/model.py#L160 , it is even faster, you may get `4.0/it`. [EDIT]: I thought it was training time.
>
> There is...
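For comparison only, the snippet below calls a plain (unpruned) RNN-T loss via torchaudio's generic implementation; it is not k2's pruned loss from model.py#L160, whose API is not reproduced here, and the shapes are made up for the example.

```python
# Unpruned RNN-T loss via torchaudio, as a generic reference point.
import torch
import torchaudio

B, T, U, V = 4, 100, 20, 500                  # batch, frames, target length, vocab
logits = torch.randn(B, T, U + 1, V)          # joiner output over the full lattice
targets = torch.randint(1, V, (B, U), dtype=torch.int32)
logit_lengths = torch.full((B,), T, dtype=torch.int32)
target_lengths = torch.full((B,), U, dtype=torch.int32)

loss = torchaudio.functional.rnnt_loss(
    logits, targets, logit_lengths, target_lengths,
    blank=0, reduction="sum",                 # summed over the whole batch
)
```

The pruned variant avoids materializing the full (T, U+1, V) lattice, which is why it is both faster and lighter on memory.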
> > to my surprise, the loss is quite large
>
> Please clarify whether the loss is
>
> * the sum of the loss over all frames in...
> > so I guess, the loss is the sum of the loss over all frames in batch.
>
> Yes, you can divide it by the number of acoustic...
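Concretely, the normalization suggested above is just dividing the summed loss by the total number of acoustic frames in the batch; a minimal sketch, with illustrative names:

```python
# Sketch: convert a batch-summed transducer loss into a per-frame loss.
import torch

def per_frame_loss(loss_sum: torch.Tensor, feature_lengths: torch.Tensor) -> torch.Tensor:
    """loss_sum: scalar loss summed over the batch;
    feature_lengths: (batch,) number of acoustic frames per utterance."""
    num_frames = feature_lengths.sum()
    return loss_sum / num_frames
```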