Abdulmajeed
I really loved the way you thought about it; it's out-of-the-box thinking :P. This is how I solved it. It would be cool if you have your...
@deshraj here's the update.
@deshraj I mostly use [excalidraw](https://excalidraw.com/); it's a pretty decent tool. I love it.
I added the following to the augmentor, using this snippet from the online augmentation tutorial:
```python
rir_data_path = f'{data_dir}/dataset'
!python {NEMO_ROOT}/scripts/dataset_processing/get_openslr_rir_data.py --data_root {rir_data_path}
rir_manifest_path = os.path.join(rir_data_path, 'processed', 'rir.json')
!head -n...
```
@nithinraok that's what I thought. However, in Titanet-Large they use noise instead of impulse, and it says we are using impulse perturbation. So, does that mean in their training they...
@cachho Follow up
No worries, by all means. I just love to credit people's work. Anyways, amazing work!
Fine-tuning the parameters **rows per band** and **number of bands** in MinHash depends heavily on the data and the specific use case, such as fuzzy deduplication of text pairs in...
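To get a feel for how the two parameters trade off, here is a minimal sketch of the standard banding formula for MinHash LSH: a pair with Jaccard similarity `s` becomes a candidate with probability `1 - (1 - s^r)^b`, where `r` is rows per band and `b` is the number of bands. The function names and the example settings below are my own, for illustration only; they are not taken from any particular library, and the right values still depend on your data.

```python
def candidate_probability(similarity: float, rows_per_band: int, num_bands: int) -> float:
    """Probability that a pair with this Jaccard similarity matches in at least one band."""
    # One band matches only if all of its rows agree: similarity ** rows_per_band.
    # The pair is missed only if every band fails to match.
    return 1.0 - (1.0 - similarity ** rows_per_band) ** num_bands


def approx_threshold(rows_per_band: int, num_bands: int) -> float:
    """Rough similarity at which the S-curve rises most steeply: (1/b) ** (1/r)."""
    return (1.0 / num_bands) ** (1.0 / rows_per_band)


if __name__ == "__main__":
    # Hypothetical settings, all using 128 hash functions in total.
    for rows, bands in [(4, 32), (8, 16), (16, 8)]:
        threshold = approx_threshold(rows, bands)
        p_at_08 = candidate_probability(0.8, rows, bands)
        print(f"rows={rows:2d} bands={bands:2d} "
              f"threshold~{threshold:.2f} P(candidate | s=0.8)={p_at_08:.3f}")
```

More rows per band pushes the threshold up (fewer false positives, more misses), while more bands pulls it down (more candidates to verify), so it is worth plotting this curve against the similarity distribution of your own pairs before settling on values.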
@solee0022, I've identified the root of the issue. It's related to my training configuration file. Inside ```configs/your_config_name.json```, there's a flag **cleaned_text: True**. When set to True, the model assumes that...