
Array size is coming as negative

kbramhendra opened this issue 3 years ago • 17 comments

Hi, while decoding with k2 I am facing this error in k2.shortest_path: "Check failed: size >= 0 (-8388608 vs. 0) Array size MUST be greater than or equal to 0, given :-8388608". Can you please help with this?
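For context on why the size prints as a negative number: as the stack traces later in this thread show, the failing k2::Array1 constructor takes its size as an int32_t, so a size computation that exceeds 2^31 - 1 wraps around to a negative value. This is only an illustration of the wrap-around arithmetic, not a claim about where exactly the overflow happens inside k2:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def to_int32(x: int) -> int:
    """Emulate C int32_t wrap-around (two's complement)."""
    return (x - INT32_MIN) % 2**32 + INT32_MIN

# A size computation just past INT32_MAX wraps to a large negative number:
print(to_int32(INT32_MAX + 1))   # -2147483648
# The -8388608 in the error corresponds to an unsigned value of 4286578688:
print(to_int32(4286578688))      # -8388608
```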

kbramhendra avatar Oct 19 '22 03:10 kbramhendra

Which decoding method are you using? How large is the graph you use for decoding?

csukuangfj avatar Oct 19 '22 04:10 csukuangfj

Thanks for replying. The graph size is 1.1 GB and the vocab size is 231670. I am using 1-best decoding in k2.

kbramhendra avatar Oct 19 '22 04:10 kbramhendra

What's your max duration?

csukuangfj avatar Oct 19 '22 04:10 csukuangfj

To give you the full context, I am using espnet + k2; k2 is used for decoding (I took it from icefall). The max duration of an audio file is 30 seconds.

kbramhendra avatar Oct 19 '22 04:10 kbramhendra

Further to add: this problem happens when one short segment (~1 sec) is batched together with somewhat longer sentences (avg 9 sec), but the same segment is recognised fine when processed alone. Also, if I increase max_states, it processes in the batch as well. I am not able to figure out the exact reason why it's failing. Please help with this.
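Until the root cause is found, one possible workaround is to retry a failing batch utterance by utterance, since the short segment reportedly decodes fine on its own. A hypothetical sketch — decode_batch and decode_one below are stand-ins for whatever decoding calls you use, not k2 or espnet APIs:

```python
def decode_with_fallback(batch, decode_batch, decode_one):
    """Try batched decoding; if it raises, decode each utterance alone."""
    try:
        return decode_batch(batch)
    except RuntimeError:
        # Fall back to per-utterance decoding so one bad combination
        # of segment lengths does not fail the whole batch.
        return [decode_one(utt) for utt in batch]
```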

kbramhendra avatar Oct 19 '22 09:10 kbramhendra

[F] /home/ngoel/k2/k2/csrc/array.h:501:void k2::Array1<T>::Init(k2::ContextPtr, int32_t, k2::Dtype) [with T = int; k2::ContextPtr = std::shared_ptr<k2::Context>; int32_t = int] Check failed: size >= 0 (-402279408 vs. 0) Array size MUST be greater than or equal to 0, given :-402279408

I am getting a similar error while decoding. No graph, no espnet, just plain k2 fast-beam-search decoding from a pruned-transducer-stateless2 model. The longest audio file is 186.4 seconds.

The Python stack is:

  File "/mnt/dsk1/icefall/egs/dh/pruned_transducer_stateless2/beam_search.py", line 77, in fast_beam_search_one_best
    best_path = one_best_decoding(lattice)
  File "/home/ngoel/icefall/icefall/decode.py", line 474, in one_best_decoding
    best_path = k2.shortest_path(lattice, use_double_scores=use_double_scores)
  File "/home/ngoel/k2/k2/python/k2/fsa_algo.py", line 594, in shortest_path
    ragged_arc, ragged_int = _k2.shortest_path(fsa.arcs, entering_arcs)

ngoel17 avatar Oct 20 '22 17:10 ngoel17


I have been getting the same error with fast_beam_search decoding in icefall. I will try to dig into it a little more today.

desh2608 avatar Oct 24 '22 15:10 desh2608

Here is the full stack-trace:

[F] /exp/draj/mini_scale_2022/k2/k2/csrc/array.h:501:void k2::Array1<T>::Init(k2::ContextPtr, int32_t, k2::Dtype) [with T = int; k2::ContextPtr = std::shared_ptr<k2::Context>; int32_t = int] Check failed: size >= 0 (-797647121 vs. 0) Array size MUST be greater than or equal to 0, given :-797647121


[ Stack-Trace: ]
/exp/draj/mini_scale_2022/k2/build_debug/lib/libk2_log.so(k2::internal::GetStackTrace()+0x46) [0x2aabbe9a5082]
/exp/draj/mini_scale_2022/k2/build_debug/lib/libk2context.so(k2::internal::Logger::~Logger()+0x35) [0x2aabb4d0e059]
/exp/draj/mini_scale_2022/k2/build_debug/lib/libk2context.so(k2::Array1<int>::Init(std::shared_ptr<k2::Context>, int, k2::Dtype)+0x1f6) [0x2aabb4d12250]
/exp/draj/mini_scale_2022/k2/build_debug/lib/libk2context.so(k2::Array1<int>::Array1(std::shared_ptr<k2::Context>, int, k2::Dtype)+0x50) [0x2aabb4d0fd44]
/exp/draj/mini_scale_2022/k2/build_debug/lib/libk2context.so(k2::RaggedShape::RowIds(int)+0x457) [0x2aabb4ebdce3]
/exp/draj/mini_scale_2022/k2/build_debug/lib/libk2context.so(k2::ShortestPath(k2::Ragged<k2::Arc>&, k2::Array1<int> const&)+0x898) [0x2aabb4d836f7]
/exp/draj/mini_scale_2022/k2/build_debug/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x1275f4) [0x2aabafeb35f4]
/exp/draj/mini_scale_2022/k2/build_debug/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x1389a2) [0x2aabafec49a2]
/exp/draj/mini_scale_2022/k2/build_debug/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x1373b4) [0x2aabafec33b4]
/exp/draj/mini_scale_2022/k2/build_debug/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x133b6a) [0x2aabafebfb6a]
/exp/draj/mini_scale_2022/k2/build_debug/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x133c2c) [0x2aabafebfc2c]
/exp/draj/mini_scale_2022/k2/build_debug/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x82cb8) [0x2aabafe0ecb8]
python(+0x13c7ae) [0x5555556907ae]
python(_PyObject_MakeTpCall+0x3bf) [0x55555568525f]
python(_PyEval_EvalFrameDefault+0x5437) [0x55555572ee87]
python(_PyEval_EvalCodeWithName+0x260) [0x5555557201f0]
python(_PyFunction_Vectorcall+0x594) [0x5555557217b4]
python(_PyEval_EvalFrameDefault+0x1517) [0x55555572af67]
python(_PyEval_EvalCodeWithName+0x260) [0x5555557201f0]
python(_PyFunction_Vectorcall+0x534) [0x555555721754]
python(_PyEval_EvalFrameDefault+0x71a) [0x55555572a16a]
python(_PyEval_EvalCodeWithName+0x260) [0x5555557201f0]
python(_PyFunction_Vectorcall+0x594) [0x5555557217b4]
python(_PyEval_EvalFrameDefault+0x1517) [0x55555572af67]
python(_PyEval_EvalCodeWithName+0xd5f) [0x555555720cef]
python(_PyFunction_Vectorcall+0x594) [0x5555557217b4]
python(_PyEval_EvalFrameDefault+0x1517) [0x55555572af67]
python(_PyEval_EvalCodeWithName+0x260) [0x5555557201f0]
python(_PyFunction_Vectorcall+0x594) [0x5555557217b4]
python(_PyEval_EvalFrameDefault+0x1517) [0x55555572af67]
python(_PyFunction_Vectorcall+0x1b7) [0x5555557213d7]
python(PyObject_Call+0x7d) [0x55555568b57d]
python(_PyEval_EvalFrameDefault+0x1dd3) [0x55555572b823]
python(_PyEval_EvalCodeWithName+0xd5f) [0x555555720cef]
python(_PyFunction_Vectorcall+0x594) [0x5555557217b4]
python(_PyEval_EvalFrameDefault+0x71a) [0x55555572a16a]
python(_PyEval_EvalCodeWithName+0x260) [0x5555557201f0]
python(PyEval_EvalCode+0x23) [0x555555721aa3]
python(+0x241382) [0x555555795382]
python(+0x252202) [0x5555557a6202]
python(+0x2553ab) [0x5555557a93ab]
python(PyRun_SimpleFileExFlags+0x1bf) [0x5555557a958f]
python(Py_RunMain+0x3a9) [0x5555557a9a69]
python(Py_BytesMain+0x39) [0x5555557a9c69]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x2aaaab616445]
python(+0x1f7427) [0x55555574b427]

Traceback (most recent call last):
  File "pruned_transducer_stateless2/decode.py", line 732, in <module>
    main()
  File "/home/hltcoe/draj/.conda/envs/scale/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "pruned_transducer_stateless2/decode.py", line 711, in main
    results_dict = decode_dataset(
  File "pruned_transducer_stateless2/decode.py", line 470, in decode_dataset
    hyps_dict = decode_one_batch(
  File "pruned_transducer_stateless2/decode.py", line 319, in decode_one_batch
    hyp_tokens = fast_beam_search_one_best(
  File "/exp/draj/mini_scale_2022/icefall/egs/ami/ASR/pruned_transducer_stateless2/beam_search.py", line 78, in fast_beam_search_one_best
    best_path = one_best_decoding(lattice)
  File "/exp/draj/mini_scale_2022/icefall/icefall/decode.py", line 474, in one_best_decoding
    best_path = k2.shortest_path(lattice, use_double_scores=use_double_scores)
  File "/exp/draj/mini_scale_2022/k2/k2/python/k2/fsa_algo.py", line 594, in shortest_path
    ragged_arc, ragged_int = _k2.shortest_path(fsa.arcs, entering_arcs)
RuntimeError: 
    Some bad things happened. Please read the above error messages and stack
    trace. If you are using Python, the following command may be helpful:

      gdb --args python /path/to/your/code.py

    (You can use `gdb` to debug the code. Please consider compiling
    a debug version of k2.).

    If you are unable to fix it, please open an issue at:

      https://github.com/k2-fsa/k2/issues/new

desh2608 avatar Oct 24 '22 16:10 desh2608

@desh2608 @ngoel17 Could you submit the lattice so that I can reproduce your error?

glynpu avatar Oct 25 '22 01:10 glynpu

@desh2608 @ngoel17 You could save the lattice with the following statement and share it with us via Google Drive.

torch.save(lattice.as_dict(), "lattice_triggering_shortest_path_error.pt")

glynpu avatar Oct 25 '22 15:10 glynpu

@glynpu You can find the erroneous lattice here: error_lattice.tar.gz

Following your advice in this comment, I modified this line in icefall to:

best_path = k2.shortest_path(k2.arc_sort(k2.connect(lattice)), use_double_scores=False)

After this, the decoding could run successfully, but the WER was worse than greedy search.

desh2608 avatar Oct 25 '22 15:10 desh2608

Thanks for the report!

> After this, the decoding could run successfully, but the WER was worse than greedy search.

I have two questions about this:

Q1: Was the WER worse for this single sentence only, or for the whole test set?

Q2: What is the breakdown of the additional errors? Is there a typical pattern, like more deletions, or insertions at the end, or something like that?

If the error could be solved by k2.connect(lattice), I suspect that the recognized result may be an empty string.

glynpu avatar Oct 25 '22 16:10 glynpu

The erroneous lattice (error_lattice.tar.gz) is not connected. You can verify this with the following code:

import k2
import torch

lattice_state = torch.load('lattice_triggering_shortest_path_error.pt')
lattice = k2.Fsa.from_dict(lattice_state)
connected_lattice = k2.connect(lattice)
# connected_lattice.num_arcs == 0 means zero arcs are left
print(connected_lattice.num_arcs)

It is quite abnormal that a non-connected lattice was generated.
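For intuition, k2.connect trims states that are not on any complete path from the start state to the final state; if no such path exists at all, everything is removed and the result has zero arcs. A toy pure-Python sketch of that trimming idea (not k2's implementation — states are plain integers and arcs are (src, dst) pairs):

```python
def trim(arcs, start, final):
    """Keep only arcs whose endpoints are both reachable from `start`
    and can reach `final`, i.e. lie on some complete path."""
    def closure(seed, edges):
        reached, changed = {seed}, True
        while changed:
            changed = False
            for s, d in edges:
                if s in reached and d not in reached:
                    reached.add(d)
                    changed = True
        return reached

    accessible = closure(start, arcs)                         # forward reachability
    coaccessible = closure(final, [(d, s) for s, d in arcs])  # backward reachability
    keep = accessible & coaccessible
    return [(s, d) for s, d in arcs if s in keep and d in keep]

# A dead-end branch (0 -> 3) is trimmed away:
print(trim([(0, 1), (1, 2), (0, 3)], start=0, final=2))  # [(0, 1), (1, 2)]
# If the final state is unreachable, the whole lattice becomes empty:
print(trim([(0, 1)], start=0, final=2))  # []
```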

I would very much appreciate it if you could upload the related model and input features to help us debug what results in a non-connected lattice.

glynpu avatar Oct 25 '22 16:10 glynpu

> Q1: Was the WER worse for this single sentence only, or for the whole test set? Q2: Is there a typical pattern in the errors?
>
> If the error could be solved by k2.connect(lattice), I suspect that the recognized result may be an empty string.

Yes, it seems you are right. The error stats are as follows:

Greedy search: %WER = 22.66 [2315 insertions, 6059 deletions, 13146 substitutions, over 94951 reference words (75746 correct)]

Fast beam search (after the change): %WER = 22.92 [2270 insertions, 6959 deletions, 12536 substitutions, over 94951 reference words (75456 correct)]

So it seems the additional errors are all deletions. I checked the generated hypotheses, and several of them are empty; these must be the ones that had a non-connected lattice. I'll package the model and the input corresponding to the error lattice and attach them here in a bit.
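As a sanity check, the reported WERs match the error counts above, since WER = (insertions + deletions + substitutions) / reference words:

```python
def wer(ins, dels, subs, ref_words):
    """Word error rate in percent, rounded to two decimals."""
    return round(100 * (ins + dels + subs) / ref_words, 2)

print(wer(2315, 6059, 13146, 94951))  # 22.66 (greedy search)
print(wer(2270, 6959, 12536, 94951))  # 22.92 (fast beam search)
```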

desh2608 avatar Oct 25 '22 16:10 desh2608

> I'll package the model and the input corresponding to the error lattice and attach it here in a bit.

Thanks! It would be much better if you could also provide the git sha1 (commit id) of your icefall/k2.

glynpu avatar Oct 25 '22 16:10 glynpu

Here is the package: Google drive link

It contains the following:

empty_lattice_package
|-- bpe.model
|-- cuts_dev_ihm_error.tar.gz
`-- pretrained.pt

0 directories, 3 files

The cuts are stored in webdataset format, so you can load them using CutSet.from_webdataset(). I identified 5 utterances that were producing empty outputs and packaged them in the cut set.

Decoding parameters:

    --decoding-method fast_beam_search \
    --beam-size 4 \
    --max-contexts 4 \
    --max-states 8

Git details (using git rev-parse HEAD):

  • k2: 1b151e5280046757a118fd1e9a5ed9dde8d18345
  • icefall: 9b671e1c21c190f68183f05d33df1c134079ca18

Let me know if you need any more details.

desh2608 avatar Oct 25 '22 17:10 desh2608

> Let me know if you need any more details.

Thanks for your package! I can reproduce the error now.

glynpu avatar Oct 26 '22 13:10 glynpu