[ASR][Tools] RIR corpus generator

Open anteju opened this issue 3 years ago • 0 comments

What does this PR do ?

This PR adds a tool for generating a corpus of multi-channel room impulse responses (RIRs).

Collection: ASR

Changelog

  • Added RoomCorpusGenerator in asr/data/data_simulation.py
  • Added unit tests in tests/collections/asr/test_asr_data_simulation.py
  • Added an example config and a readme file in tools/rir_corpus_generator

Usage

The tool can be used as

python rir_corpus_generator.py output_dir=OUTPUT_DIR

where OUTPUT_DIR is a path to the output directory. This will use the default configuration in conf/rir_corpus_v1.yaml.

The output will be structured as

OUTPUT_DIR
+--{train, dev, test}
|	+--*.h5
+--config.yaml
+--{train, dev, test}_manifest.yaml
+--{train, dev, test}_info.png

Each directory, i.e., train, dev, and test, corresponds to a subset of the data and contains the *.h5 files with RIRs. The corresponding *_manifest.yaml files contain metadata for each subset.
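As a rough illustration of the layout above, the following sketch collects the *.h5 files of each subset from an existing OUTPUT_DIR. It relies only on the directory structure described here. The helper name list_rir_files is hypothetical and not part of the tool, and the internal format of the *.h5 files is not assumed.

```python
from pathlib import Path

# Subset names taken from the output layout described above.
SUBSETS = ("train", "dev", "test")


def list_rir_files(output_dir):
    """Map each subset name to the sorted *.h5 files in its directory.

    Hypothetical helper for illustration; it is not part of the tool.
    """
    root = Path(output_dir)
    files = {}
    for subset in SUBSETS:
        subset_dir = root / subset
        # A subset directory that does not exist maps to an empty list.
        if subset_dir.is_dir():
            files[subset] = sorted(subset_dir.glob("*.h5"))
        else:
            files[subset] = []
    return files
```

Calling list_rir_files("OUTPUT_DIR") returns a dictionary such as {"train": [...], "dev": [...], "test": [...]}, which can be used to check that every subset was generated before inspecting the manifests.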

Before your PR is "Ready for review"

Pre checks:

  • [x] Make sure you read and followed Contributor guidelines
  • [x] Did you write any new necessary tests?
  • [x] Did you add or update any necessary documentation?
  • [ ] Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • [ ] Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • [x] New Feature
  • [ ] Bugfix
  • [ ] Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed. The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • n/a

anteju, Sep 13 '22 18:09