espnet add ASR evaluation for long speech

What?

Add evaluation scripts for long speech ASR.

Why?

Long speech (e.g. longer than 20 seconds) should first be split into shorter segments before ASR performance is evaluated. This PR adds code that splits the speech based on a length threshold.
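
As a rough illustration of threshold-based splitting, the sketch below cuts a recording into fixed-length sample ranges no longer than a maximum duration. It is not the code from this PR: the function name `split_by_duration`, its parameters, and the 20-second default (taken from the example above) are all hypothetical, and cutting at fixed offsets rather than at silences is a simplifying assumption.

```python
from typing import List, Tuple


def split_by_duration(
    num_samples: int, sample_rate: int, max_seconds: float = 20.0
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges, each at most max_seconds long.

    Hypothetical helper for illustration only; not part of ESPnet.
    """
    if sample_rate <= 0 or max_seconds <= 0:
        raise ValueError("sample_rate and max_seconds must be positive")
    # Longest allowed segment, expressed in samples.
    max_len = int(max_seconds * sample_rate)
    if max_len <= 0:
        raise ValueError("max_seconds is too small for this sample_rate")
    segments = []
    # Step through the recording in chunks of max_len samples;
    # the final chunk is clipped to the end of the recording.
    for start in range(0, num_samples, max_len):
        segments.append((start, min(start + max_len, num_samples)))
    return segments


# 50 s of 16 kHz audio is cut into 20 s, 20 s, and 10 s segments.
segments = split_by_duration(num_samples=50 * 16000, sample_rate=16000)
```

Each range can then be decoded on its own. A fixed cut can land in the middle of a word, so a real pipeline may prefer to cut at pauses; that choice is outside what this sketch covers.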

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 70.10%. Comparing base (d004740) to head (85212a5). Report is 524 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff             @@
##           master    #5696       +/-   ##
===========================================
+ Coverage   23.30%   70.10%   +46.80%     
===========================================
  Files         746      746               
  Lines       69369    69369               
===========================================
+ Hits        16163    48634    +32471     
+ Misses      53206    20735    -32471

Flag	Coverage Δ
test_configuration_espnet2	`∅ <ø> (∅)`
test_integration_espnet1	`62.92% <ø> (ø)`
test_python_espnet1	`18.32% <ø> (ø)`
test_python_espnet2	`52.05% <ø> (?)`

Flags with carried forward coverage won't be shown.


Mar 07 '24 00:03 codecov[bot]

This pull request is now in conflict :(

Jul 10 '24 01:07 mergify[bot]
