johnchienbronci issues

Results: 11 issues of johnchienbronci

Unsupported action found at bulk response.

Hi, elasticlient lib version: version-02, Elasticsearch version: 7.17.0. My bulk operation fails without raising an error. Checking the log, I found this message: Unsupported 'action' found at bulk response. Detailed log: ``` Host...
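A bulk request can fail item by item without the request as a whole failing, so it can help to inspect the response body directly. The following is a minimal sketch in Python, not elasticlient code: it assumes the standard bulk response shape (a top-level `errors` flag and an `items` list of one-key objects keyed by action name). The helper name `failed_bulk_items` and the sample response are hypothetical and are not taken from the log in this issue.

```python
# Hypothetical helper: collect per-item failures from an Elasticsearch bulk response.
KNOWN_ACTIONS = ("index", "create", "update", "delete")


def failed_bulk_items(response):
    """Return (position, action, status, error) for every failed or unknown item."""
    failures = []
    for position, item in enumerate(response.get("items", [])):
        # Each item is a one-key dict: {action_name: result}.
        for action, result in item.items():
            if action not in KNOWN_ACTIONS or "error" in result:
                failures.append(
                    (position, action, result.get("status"), result.get("error"))
                )
    return failures


# Illustrative response only; not the response from the issue.
sample_response = {
    "took": 3,
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {
            "create": {
                "_id": "2",
                "status": 409,
                "error": {"type": "version_conflict_engine_exception"},
            }
        },
    ],
}

failures = failed_bulk_items(sample_response)
for position, action, status, error in failures:
    print(f"item {position}: action={action} status={status} error={error}")
```

Printing the failed items next to their position in the request makes it easier to see which action name the client did not recognize.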

Word timestamps may be inaccurate for a short audio file with silence between sentences

version: master (e9a082dcf27647eb52585cda6f115454c0ac6856) voice file: [test.mp3.zip](https://github.com/guillaumekln/faster-whisper/files/11170404/test.mp3.zip) voice file: the silence between sentences lasts longer than 2 s. audio wav: vad_filter and word_timestamp are enabled. Testing different scenarios, the word cannot be...

CUDA error: an illegal memory access was encountered

I encountered some errors when running run_speech_recognition_ctc_streaming.sh with `deepspeed` (`torchrun --nproc_per_node 1 ...`), and this issue consistently occurs with my custom corpora. Does anyone have any ideas?...

How can I test TCP and check recv package?

Hello, I have a testing project that involves the use of the TCP protocol. Reference: (https://github.com/processone/tsung/blob/develop/examples/raw.xml.in) However, I'm unsure about how to perform operations using the recv function. Are there...

After training a custom language model with kenlm, it seems to have no effect on error detection and typo correction

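I am currently testing by training on specific sentences. text: .... 讓座今天應該 ... Training info: model = pycorrector.Corrector(language_model_path='corpus/lm.klm') correct_sent, detailect_sent = model.correct("少先隊員因該為老人讓坐") The result does not flag any typos. What could cause the correction to have no effect?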

question

wontfix

KeyError: 'transcription'

```
File "/usr/local/bin/source/huggingsound/examples/speech_recognition/finetune.py", line 71, in
  model.finetune(
File "/usr/local/lib/python3.10/dist-packages/huggingsound/speech_recognition/model.py", line 353, in finetune
  train_dataset = self._get_dataset(processor, text_normalizer, train_data, train_data_cache_dir, training_args.length_column_name, num_workers)
File "/usr/local/lib/python3.10/dist-packages/huggingsound/speech_recognition/model.py", line 272, in _get_dataset
  dataset = self._prepare_dataset_for_finetuning(dataset,...
```

How to use narrowband?

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/lhotse/cut/mixed.py", line 285, in __getattr__
    ) = self._assert_one_data_cut_with_attr_and_return_it_with_track_index(name)
  File "/usr/local/lib/python3.10/dist-packages/lhotse/cut/mixed.py", line 378, in _assert_one_data_cut_with_attr_and_return_it_with_track_index
    assert len(non_padding_cuts_with_custom_attr) == 1, (
AssertionError: This MixedCut has 0 non-padding...

Truncation causes duration < 0

@pzelasko Hi, whenever I add noise and read only a segment of the audio file (using offset + duration), I get a new_duration < 0 error: ``` File "/usr/local/lib/python3.10/dist-packages/lhotse/dataset/sampling/dynamic_bucketing.py", line 299, in _next_batch...

[Bug] The number of declared samples in the recording diverged from the one obtained when loading audio (offset is 0.0)

@pzelasko Sample rate: 16 kHz. Total length of the sound file: 48.92 s. Reading a segment of the audio file with offset: 0.0 and duration: 4.948 produces the following error:...
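For reference, the number of samples implied by the figures above can be worked out by hand. This is a minimal arithmetic sketch in Python, not lhotse code: the helper name `expected_num_samples` is hypothetical, and rounding to the nearest sample is an assumption about how a declared duration maps to a sample count.

```python
def expected_num_samples(duration_s, sampling_rate):
    """Samples implied by a duration in seconds (assumes nearest-sample rounding)."""
    return round(duration_s * sampling_rate)


sampling_rate = 16_000  # 16 kHz, as stated in the issue

segment_samples = expected_num_samples(4.948, sampling_rate)  # requested segment
total_samples = expected_num_samples(48.92, sampling_rate)    # whole file

print(segment_samples, total_samples)
```

Comparing these values with the length of the array actually returned by the audio loader shows how far the declared and loaded sample counts have diverged.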

Mismatch between x.size(1) and x_lens.max() due to subsampling length calculation

When training zipformer with CutConcatenate enabled, I hit an assertion error in subsampling.py in forward: ``` AssertionError: (1231, tensor(1232, device='cuda:5', dtype=torch.int32)) ``` The assertion is: assert x.size(1) == x_lens.max().item(), (x.size(1),...
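An off-by-one of this kind is what two different rounding conventions for a subsampled length produce. The sketch below is illustrative only: both formulas are assumptions chosen for the example and are not claimed to be the code in subsampling.py or the cause of this issue. It compares a round-down and a round-up length calculation over a small range of input lengths.

```python
def subsampled_len_floor(num_frames):
    # Assumed formula: drop 7 frames of context, then halve, rounding down.
    return (num_frames - 7) // 2


def subsampled_len_ceil(num_frames):
    # Same assumed formula, but rounding up instead of down.
    return (num_frames - 7 + 1) // 2


# Input lengths for which the two conventions disagree.
mismatches = [
    t for t in range(2465, 2475)
    if subsampled_len_floor(t) != subsampled_len_ceil(t)
]

for t in mismatches:
    print(t, subsampled_len_floor(t), subsampled_len_ceil(t))
```

If the padded tensor length and the per-utterance lengths are computed by different code paths, checking that both use the same rounding is a reasonable first step.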