Peter Ross
Fixes for making https://github.com/geohot/tinygrad/commit/084bdd0f49541c43a3ec4635b670ec58f855e7b3 run with CPU=1. There may be a better way to make Tensor.neg work for integer types. Open to ideas.
Use [diff -w](https://github.com/geohot/tinygrad/commit/f13c2a996099a738221a4d04de2e15432d45df83?w=1) to see the run_onnx changes. Results for test/external/external_test_onnx_backend.py: before: 167 failed, 645 passed, 1822 skipped, 1 warning in 28.40s; after: 167 failed, 646 passed, 1821 skipped, 1 warning in 28.58s.
$subject. BusyBox shell does this:
```
/tmp # sed --version
This is not GNU sed version 4.0
```
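A minimal sketch of why that output is awkward, assuming the underlying problem is a version check that searches `sed --version` output for the substring "GNU sed". The function name `is_gnu_sed` and the GNU-style sample string are illustrative assumptions, not taken from the patch.

```python
def is_gnu_sed(version_output: str) -> bool:
    """Guess whether `sed --version` output came from GNU sed.

    Hypothetical helper: a plain substring test for "GNU sed" is fooled by
    BusyBox, which prints "This is not GNU sed version 4.0".
    """
    text = version_output.strip()
    first_line = text.splitlines()[0] if text else ""
    # Reject the BusyBox disclaimer before accepting the substring match.
    if "not GNU sed" in first_line:
        return False
    return "GNU sed" in first_line


busybox = "This is not GNU sed version 4.0"
gnu_like = "sed (GNU sed) 4.8"  # assumed example of GNU-style output

print(is_gnu_sed(busybox))   # False
print(is_gnu_sed(gnu_like))  # True
```

Checking for the negated phrase first keeps the test a simple string comparison while avoiding the false positive.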
Using the icefall/egs/librispeech/ASR/pruned_transducer_stateless7 recipe, training a model on only train-clean-5 and dev-clean-2, and then running pruned_transducer_stateless7/decode.py on GPU with --decoding-method fast_beam_search_nbest_LG produces the following error.
```
[F] /home/user/k2/k2/csrc/top_sort.cu:324:k2::FsaVec k2::TopSorter::TopSort(k2::Array1*) Check failed:...
```
I have been experimenting with the pre-trained zipformer streaming models: * https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 * https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615 A problem I observe is the models do not recognise more than two single-syllable words/characters in...
By default, the sentencepiece model outputs U+2047 (⁇) as the unknown symbol text. This option allows the unknown symbol text to be customised.
This patch adds an audio codec transformation. I have found that when applying K2 ASR to speech compressed with mulaw, it is advantageous to augment the training data with these...
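For readers unfamiliar with mulaw, the effect of such a codec on a waveform can be sketched as below. This is not the transformation from the patch: it uses the textbook continuous mu-law formula with uniform 8-bit quantisation rather than a bit-exact G.711 codec, and the function names and the `levels` parameter are assumptions made for illustration.

```python
import math

MU = 255  # assumed: the mu value used for 8-bit mu-law telephony


def mulaw_compress(x: float, mu: int = MU) -> float:
    """Compand a sample in [-1, 1] onto [-1, 1] with the mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mulaw_expand(y: float, mu: int = MU) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def mulaw_roundtrip(samples, mu: int = MU, levels: int = 256):
    """Simulate the codec: compress, quantise to `levels` steps, expand."""
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))                 # clip to the valid range
        y = mulaw_compress(x, mu)
        q = round((y + 1.0) / 2.0 * (levels - 1))  # integer code in 0..levels-1
        y_hat = q / (levels - 1) * 2.0 - 1.0       # back to [-1, 1]
        out.append(mulaw_expand(y_hat, mu))
    return out


clean = [0.0, 0.01, -0.25, 0.5, 1.0]
degraded = mulaw_roundtrip(clean)
for a, b in zip(clean, degraded):
    print(f"{a:+.4f} -> {b:+.4f}")
```

Passing training waveforms through a round trip like this exposes the model to the quantisation noise it will meet on compressed speech.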
Various fixes :)