hypo.word file missing during MMS ASR inference
❓ Questions and Help
What is your question?
I'm facing the following issue while running the MMS ASR inference script examples/mms/asr/infer/mms_infer.py:
File "/workspace/fairseq/examples/mms/asr/infer/mms_infer.py", line 52, in <module>
process(args)
File "/workspace/fairseq/examples/mms/asr/infer/mms_infer.py", line 44, in process
with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/tmpsjatjyxt/hypo.word'
Code
python examples/mms/asr/infer/mms_infer.py --model "/workspace/fairseq/mms1b_fl102.pt" --lang "urd-script_arabic" --audio "/workspace/audio.wav"
What have you tried?
Tried running the ASR on different audios and languages
What's your environment?
- fairseq Version (e.g., 1.0 or main): main
- PyTorch Version (e.g., 1.0): 2.0.0
- OS (e.g., Linux): Linux
- How you installed fairseq (pip, source): pip
- Build command you used (if compiling from source): N/A
- Python version: 3.10.10
- CUDA/cuDNN version: 11.6
- GPU models and configuration: NVIDIA A6000
- Any other relevant information: N/A
Facing the exact same issue
Hi, can you share the entire log? I just tested the code again and it works fine from my end.
You need to check what the underlying error is. Change the subprocess call in your mms_infer.py to
out = subprocess.run(cmd, check=True, shell=True, stdout=subprocess.DEVNULL,)
print(out)
to see the error. In my case, I needed to pass cpu=True because I don't have CUDA installed. I did this by modifying my infer_common.yaml file to add a new top-level common key with cpu: true inside it:
common:
  cpu: true
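The suggested change can be taken one step further: rather than discarding output with stdout=subprocess.DEVNULL, capture it so a failure in the child process is actually visible. A minimal sketch (the cmd here is an illustrative stand-in, not the real infer.py command string the script builds):

```python
import subprocess

# Stand-in for the real infer.py invocation built by mms_infer.py.
cmd = "echo running inference"

# check=True raises CalledProcessError on a non-zero exit;
# capture_output/text keep the child's stdout and stderr for inspection.
out = subprocess.run(cmd, shell=True, check=True, capture_output=True, text=True)
print(out.stdout)
print(out.stderr)
```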
I am hitting this too, and I am not sure what I am doing wrong. I'm not sure if I am using the right lang_code; the docs don't say what the language codes are or which standard they reference. I have tried en and en-US so far.
Sure, here is my full log:
(base) hello_automate_ai@machinelearningnotebook:~/fairseqmmstest/fairseq$ python "examples/mms/asr/infer/mms_infer.py" --model "/home/hello_automate_ai/fairseqmmstest/mms1b_all.pt" --lang hin --audio "/home/hello_automate_ai/fairseqmmstest/audio.wav"
preparing tmp manifest dir ...
loading model & running inference ...
Traceback (most recent call last):
File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/speech_recognition/new/infer.py", line 18, in
This is after the fix suggested by audiolion
@audiolion We expect a 3-digit language code. See 'Supported languages' section in README file for each model. For example - use 'eng' for English.
@shsagnik
No module named 'editdistance' - You should install the missing module.
@shsagnik
ModuleNotFoundError: No module named 'editdistance'
you need to install the modules that are used
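One quick way to see which dependencies are still missing before running the script (a sketch; the module list below is collected from errors reported in this thread, not an official requirements list):

```python
import importlib.util

# Modules that have shown up as ModuleNotFoundError in this thread,
# mapped to the pip package that provides them.
candidates = {
    "editdistance": "editdistance",
    "omegaconf": "omegaconf",
    "hydra": "hydra-core",
    "soundfile": "soundfile",
    "sklearn": "scikit-learn",
    "tensorboardX": "tensorboardX",
}

for mod, pip_name in candidates.items():
    if importlib.util.find_spec(mod) is None:
        print(f"missing: {mod} -> pip install {pip_name}")
```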
Got these errors this time
preparing tmp manifest dir ...
loading model & running inference ...
/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/core/plugins.py:202: UserWarning: Error importing 'hydra_plugins.hydra_colorlog'. Plugin is incompatible with this Hydra version or buggy. Recommended to uninstall or upgrade plugin.
ImportError: cannot import name 'SearchPathPlugin' from 'hydra.plugins' (/home/hello_automate_ai/miniconda3/lib/python3.10/site-packages/hydra/plugins/init.py)
  warnings.warn(
Traceback (most recent call last):
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/hello_automate_ai/INFER/None'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/hello_automate_ai/INFER'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/hello_automate_ai/miniconda3/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/hello_automate_ai'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/hello_automate_ai/fairseqmmstest/fairseq/examples/speech_recognition/new/infer.py", line 499, in
Getting pretty much the same. I used the right 3-letter language code (while waiting on #5119 to be answered), and it doesn't seem to have an effect; the hypo.word error still shows up.
I got this error when trying to run ASR on Google Colab:
/content/fairseq
>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
File "/content/fairseq/examples/speech_recognition/new/infer.py", line 21, in <module>
from examples.speech_recognition.new.decoders.decoder_config import (
File "/content/fairseq/examples/speech_recognition/__init__.py", line 1, in <module>
from . import criterions, models, tasks # noqa
File "/content/fairseq/examples/speech_recognition/criterions/__init__.py", line 15, in <module>
importlib.import_module(
File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/content/fairseq/examples/speech_recognition/criterions/cross_entropy_acc.py", line 13, in <module>
from fairseq import utils
File "/content/fairseq/fairseq/__init__.py", line 20, in <module>
from fairseq.distributed import utils as distributed_utils
File "/content/fairseq/fairseq/distributed/__init__.py", line 7, in <module>
from .fully_sharded_data_parallel import (
File "/content/fairseq/fairseq/distributed/fully_sharded_data_parallel.py", line 10, in <module>
from fairseq.dataclass.configs import DistributedTrainingConfig
File "/content/fairseq/fairseq/dataclass/__init__.py", line 6, in <module>
from .configs import FairseqDataclass
File "/content/fairseq/fairseq/dataclass/configs.py", line 12, in <module>
from omegaconf import II, MISSING
ModuleNotFoundError: No module named 'omegaconf'
CompletedProcess(args='\n PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path=\'/content/mms1b_fl102.pt\'" task.data=/tmp/tmp79w8mawp dataset.gen_subset="eng:dev" common_eval.post_process=letter decoding.results_path=/tmp/tmp79w8mawp\n ', returncode=1)
Traceback (most recent call last):
File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 53, in <module>
process(args)
File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 45, in process
with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp79w8mawp/hypo.word'
Please y'all read the error messages and try to debug yourself.
@dakouan18
ModuleNotFoundError: No module named 'omegaconf'
you need to install the missing modules, one of them being omegaconf
@altryne you need to print the error output to debug
@shsagnik your hydra install has some issues, and you need to specify a checkpoint directory. The script was set up to run on Linux machines where you can make directories off the root (probably in a container), so change infer_common.yaml accordingly.
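The unwritable /checkpoint/... paths in those tracebacks come from the Hydra output directories configured in infer_common.yaml. A hedged sketch of the kind of override that points them somewhere writable (the key names follow standard Hydra conventions; verify them against your copy of the file):

```yaml
hydra:
  sweep:
    dir: /tmp/INFER   # any directory your user can create, instead of /checkpoint/...
```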
Thanks @audiolion
It wasn't immediately clear that mms_infer.py calls the whole hydra thing via a command, as it obscures the errors that pop up there.
Here's the full output I'm getting (added a print out of the cmd command as well)
$ python examples/mms/asr/infer/mms_infer.py --model mms1b_l1107.pt --audio output_audio.mp3 --lang tur
>>> preparing tmp manifest dir ...
PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='mms1b_l1107.pt'" task.data=C:\Users\micro\AppData\Local\Temp\tmpxzum3zve dataset.gen_subset="tur:dev" common_eval.post_process=letter decoding.results_path=C:\Users\micro\AppData\Local\Temp\tmpxzum3zve
>>> loading model & running inference ...
Traceback (most recent call last):
File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 53, in <module>
process(args)
File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 45, in process
with open(tmpdir/"hypo.word") as fr:
^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\micro\\AppData\\Local\\Temp\\tmpxzum3zve\\hypo.word'
hi @audiolion, after installing omegaconf & hydra a new error appeared
>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
2023-05-22 22:22:29.307454: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-22 22:22:30.440434: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
File "/content/fairseq/examples/speech_recognition/new/infer.py", line 21, in <module>
from examples.speech_recognition.new.decoders.decoder_config import (
File "/content/fairseq/examples/speech_recognition/__init__.py", line 1, in <module>
from . import criterions, models, tasks # noqa
File "/content/fairseq/examples/speech_recognition/criterions/__init__.py", line 15, in <module>
importlib.import_module(
File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/content/fairseq/examples/speech_recognition/criterions/cross_entropy_acc.py", line 13, in <module>
from fairseq import utils
File "/content/fairseq/fairseq/__init__.py", line 33, in <module>
import fairseq.criterions # noqa
File "/content/fairseq/fairseq/criterions/__init__.py", line 18, in <module>
(
TypeError: cannot unpack non-iterable NoneType object
CompletedProcess(args='\n PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path=\'/content/mms1b_fl102.pt\'" task.data=/tmp/tmpk2ot70rk dataset.gen_subset="eng:dev" common_eval.post_process=letter decoding.results_path=/tmp/tmpk2ot70rk\n ', returncode=1)
Traceback (most recent call last):
File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 53, in <module>
process(args)
File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 45, in process
with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpk2ot70rk/hypo.word'
You need to do what I said in my first comment and print the process error output. The hypo.word file is not found because the actual ASR never ran and never produced an output.
SIGH, I am, it prints the command and that's it.
>>> loading model & running inference ...
CompletedProcess(args='\nPYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path=\'mms1b_l1107.pt\'" task.data=C:\\Users\\micro\\AppData\\Local\\Temp\\tmp p9t2lty3_ dataset.gen_subset="tur:dev" common_eval.post_process=letter decoding.results_path=C:\\Users\\micro\\AppData\\Local\\Temp\\tmp9t2lty3_\n', returncode=0)
Traceback (most recent call last):
File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 55, in <module>
process(args)
File "C:\Users\micro\projects\mms\examples\mms\asr\infer\mms_infer.py", line 47, in process
with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\micro\\AppData\\Local\\Temp\\tmp9t2lty3_\\hypo.word'
However, when I go back and recreate that temp dir, and run the command manually myself I do seem to get errors.
Just for some reason not via the way you mentioned.
Had to install many packages along the way; here's a partial list (in case it helps anyone):
pip install torch==1.9.0+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html
pip install hydra-core
pip install editdistance
pip install soundfile
pip install omegaconf
pip install fairseq
pip install scikit-learn
pip install tensorboardX
Still getting nowhere. Running the subprocess command even with check=True and printing the output returns status code 0 with no errors.
Got the model to finally load and run. Apparently Windows doesn't allow : in directory names, and the above code adds :dev to the directory name.
So if you pass --lang tur like I did, it will try to create a directory named /tur:dev inside /checkpoint, which (per @audiolion) I also had to change, as /checkpoint doesn't seem to work on Windows.
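A small helper along these lines could sanitize the subset name before it is used as a path component on Windows (hypothetical; not part of fairseq):

```python
import re

def sanitize_for_windows(name: str) -> str:
    """Replace characters that are invalid in Windows path components
    (':' among them) with '_'. Hypothetical helper for illustration."""
    return re.sub(r'[<>:"|?*]', "_", name)

print(sanitize_for_windows("tur:dev"))  # -> tur_dev
```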
I think the full inference ran, as the process got stuck for a few minutes, the GPU went up to 8GB (impressive), and after a while I had 2 errors again.
The hypo.word error seems to be a "catch-all" error covering many things that could go wrong; hopefully the authors will clean it up?
I'm currently staring at this error, and am pretty sure it's due to me removing the : from the dir name:
File "C:\Users\micro\projects\mms\examples\speech_recognition\new\infer.py", line 407, in main
with InferenceProcessor(cfg) as processor:
File "C:\Users\micro\projects\mms\examples\speech_recognition\new\infer.py", line 132, in __init__
self.task.load_dataset(
File "C:\Users\micro\projects\mms\fairseq\tasks\audio_finetuning.py", line 140, in load_dataset
super().load_dataset(split, task_cfg, **kwargs)
File "C:\Users\micro\projects\mms\fairseq\tasks\audio_pretraining.py", line 175, in load_dataset
for key, file_name in data_keys:
ValueError: not enough values to unpack (expected 2, got 1)
I had the same error with Google Colab and have investigated.
my error
>>> preparing tmp manifest dir ...
PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='/content/mms1b_fl102.pt'" task.data=/content/tmp dataset.gen_subset="jpn:dev" common_eval.post_process=letter decoding.results_path=/content/tmp
>>> loading model & running inference ...
2023-05-22 22:02:52.055738: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2023-05-22 22:02:58,730][HYDRA] Launching 1 jobs locally
[2023-05-22 22:02:58,730][HYDRA] #0 : decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 common_eval.path='/content/mms1b_fl102.pt' task.data=/content/tmp dataset.gen_subset=jpn:dev common_eval.post_process=letter decoding.results_path=/content/tmp
[2023-05-22 22:02:59,254][__main__][INFO] - /content/mms1b_fl102.pt
Killed
Traceback (most recent call last):
File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 54, in <module>
process(args)
File "/content/fairseq/examples/mms/asr/infer/mms_infer.py", line 46, in process
with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/content/tmp/hypo.word'
As it turns out, it was crashing at the following location.
https://github.com/facebookresearch/fairseq/blob/af12c9c6407bbcf2bca0b2f1923cf78f3db8857c/fairseq/models/wav2vec/wav2vec2.py#L1052
Looking at the RAM status, I believe the crash was caused by lack of memory.
So I feel that perhaps increasing the memory will solve the problem.
I hope this helps you in your investigation.
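A "Killed" line with no traceback is typically the Linux OOM killer. A quick way to check available memory before loading the checkpoint (Linux-only sketch, reading /proc/meminfo directly to stay dependency-free; not part of fairseq):

```python
import os

def available_ram_gb():
    """Return MemAvailable from /proc/meminfo in GB, or None off Linux.
    Rough illustrative check only."""
    if not os.path.exists("/proc/meminfo"):
        return None
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) / 1e6  # value is reported in kB
    return None

print(available_ram_gb())
```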
Getting the same error. Also, the documentation for running the sample is horrible.
I would say it isn't a catch-all error, but rather that error handling for the subprocess call is not done: if the call to run the inference fails for any reason, the hypo.word file will never have been created, and the open() call will then fail and throw that error. So you have to work backwards from the subprocess command to find out what happened. This just got open-sourced, so it makes sense there are some rough edges; contribute back to the repo!
edit: @altryne my bad, I thought from your message that you were printing the command itself, not the output of running it. Your error does look like it's failing because of the lack of :. Good news is it's open source, so you could change : to another character, run it on Windows Subsystem for Linux, or run it in Docker.
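The failure mode described above can be made defensive: check the subprocess result before ever touching hypo.word. A sketch (names mirror mms_infer.py, but this is illustrative, not the actual script):

```python
import pathlib
import subprocess
import sys

def run_inference(cmd: str, tmpdir: pathlib.Path) -> str:
    """Run the inference command; fail loudly if it errors or leaves no output."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if proc.returncode != 0:
        sys.exit(f"inference failed (exit {proc.returncode}):\n{proc.stderr}")
    hypo = tmpdir / "hypo.word"
    if not hypo.exists():
        sys.exit("inference exited 0 but produced no hypo.word; check its output")
    return hypo.read_text()
```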
Yeah, that's what I mean: if anything goes wrong within the subprocess for any reason, folks are going to get the above-mentioned error. Then they will likely google their way into this issue, which covers many of the possible failure modes. I was trying to be extra verbose to help other folks.
Thanks! You helped a lot, I eventually had to rewrite that whole block like so:
import os
os.environ["TMPDIR"] = str(tmpdir)
os.environ["PYTHONPATH"] = "."
os.environ["PREFIX"] = "INFER"
os.environ["HYDRA_FULL_ERROR"] = "1"
os.environ["USER"] = "micro"
cmd = f"""python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path='{args.model}'" task.data={tmpdir} dataset.gen_subset="{args.lang}" common_eval.post_process={args.format} decoding.results_path={tmpdir}
"""
To even have the command execute and do something and not fail outright.
glad you got it working!
Hi, thanks for this discussion - I've learned a lot. This is the Dockerfile I created after a few hours trying to make it work:
FROM python:3.8
WORKDIR /usr/src/app
COPY . .
RUN pip install --no-cache-dir . \
&& pip install --no-cache-dir soundfile \
&& pip install --no-cache-dir torch \
&& pip install --no-cache-dir hydra-core \
&& pip install --no-cache-dir editdistance \
&& pip install --no-cache-dir omegaconf \
&& pip install --no-cache-dir scikit-learn \
&& pip install --no-cache-dir tensorboardX \
&& python setup.py build_ext --inplace \
&& apt update \
&& apt -y install libsndfile-dev \
&& rm -rf /var/lib/apt/lists/* \
&& wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq \
&& chmod +x /usr/bin/yq \
&& yq -i '.common.cpu = true' examples/mms/asr/config/infer_common.yaml
CMD [ "python", "examples/mms/asr/infer/mms_infer.py" ]
I built the image with:
docker build -t fairseq:dev .
And run it with:
docker run --rm -it -e USER=root -v $(pwd):/mms:ro fairseq:dev python examples/mms/asr/infer/mms_infer.py --model /mms/mms1b_fl102.pt --lang eng --audio /mms/audio.wav
I kept tracing errors and solving them until I hit this one:
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 657, in _load_unlocked
File "<frozen importlib._bootstrap>", line 556, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 1166, in create_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Traceback (most recent call last):
Does anyone know a solution?
I ran the code with the Docker setup, but it fails again:
>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
File "examples/speech_recognition/new/infer.py", line 499, in <module>
cli_main()
File "examples/speech_recognition/new/infer.py", line 495, in cli_main
hydra_main() # pylint: disable=no-value-for-parameter
File "/usr/local/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
_run_hydra(
File "/usr/local/lib/python3.8/site-packages/hydra/_internal/utils.py", line 354, in _run_hydra
run_and_report(
File "/usr/local/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
raise ex
File "/usr/local/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
return func()
File "/usr/local/lib/python3.8/site-packages/hydra/_internal/utils.py", line 355, in <lambda>
lambda: hydra.multirun(
File "/usr/local/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 136, in multirun
return sweeper.sweep(arguments=task_overrides)
File "/usr/local/lib/python3.8/site-packages/hydra/_internal/core_plugins/basic_sweeper.py", line 154, in sweep
results = self.launcher.launch(batch, initial_job_idx=initial_job_idx)
File "/usr/local/lib/python3.8/site-packages/hydra/_internal/core_plugins/basic_launcher.py", line 76, in launch
ret = run_job(
File "/usr/local/lib/python3.8/site-packages/hydra/core/utils.py", line 129, in run_job
ret.return_value = task_function(task_cfg)
File "examples/speech_recognition/new/infer.py", line 460, in hydra_main
distributed_utils.call_main(cfg, main)
File "/usr/src/app/fairseq/distributed/utils.py", line 404, in call_main
main(cfg, **kwargs)
File "examples/speech_recognition/new/infer.py", line 407, in main
with InferenceProcessor(cfg) as processor:
File "examples/speech_recognition/new/infer.py", line 132, in __init__
self.task.load_dataset(
File "/usr/src/app/fairseq/tasks/audio_finetuning.py", line 140, in load_dataset
super().load_dataset(split, task_cfg, **kwargs)
File "/usr/src/app/fairseq/tasks/audio_pretraining.py", line 150, in load_dataset
if task_cfg.multi_corpus_keys is None:
File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 305, in __getattr__
self._format_and_raise(key=key, value=None, cause=e)
File "/usr/local/lib/python3.8/site-packages/omegaconf/base.py", line 95, in _format_and_raise
format_and_raise(
File "/usr/local/lib/python3.8/site-packages/omegaconf/_utils.py", line 629, in format_and_raise
_raise(ex, cause)
File "/usr/local/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise
raise ex # set end OC_CAUSE=1 for full backtrace
File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 303, in __getattr__
return self._get_impl(key=key, default_value=DEFAULT_VALUE_MARKER)
File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 361, in _get_impl
node = self._get_node(key=key)
File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 383, in _get_node
self._validate_get(key)
File "/usr/local/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 135, in _validate_get
self._format_and_raise(
File "/usr/local/lib/python3.8/site-packages/omegaconf/base.py", line 95, in _format_and_raise
format_and_raise(
File "/usr/local/lib/python3.8/site-packages/omegaconf/_utils.py", line 694, in format_and_raise
_raise(ex, cause)
File "/usr/local/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise
raise ex # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ConfigAttributeError: Key 'multi_corpus_keys' is not in struct
full_key: task.multi_corpus_keys
reference_type=Any
object_type=dict
Traceback (most recent call last):
File "examples/mms/asr/infer/mms_infer.py", line 52, in <module>
process(args)
File "examples/mms/asr/infer/mms_infer.py", line 44, in process
with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp4o9kxdyr/hypo.word'
Same error.
$ python examples/mms/asr/infer/mms_infer.py --model /idiap/temp/esarkar/cache/fairseq/mms1b_all.pt --lang shp --audio /idiap/temp/esarkar/Data/shipibo/downsampled_single_folder/short/shp-ROS-2022-03-14-2.1.wav
>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/speech_recognition/new/infer.py", line 21, in <module>
from examples.speech_recognition.new.decoders.decoder_config import (
File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/speech_recognition/__init__.py", line 1, in <module>
from . import criterions, models, tasks # noqa
File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/speech_recognition/criterions/__init__.py", line 15, in <module>
importlib.import_module(
File "/idiap/temp/esarkar/miniconda/envs/fairseq/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/speech_recognition/criterions/cross_entropy_acc.py", line 13, in <module>
from fairseq import utils
File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/fairseq/__init__.py", line 33, in <module>
import fairseq.criterions # noqa
File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/fairseq/criterions/__init__.py", line 18, in <module>
(
TypeError: cannot unpack non-iterable NoneType object
Traceback (most recent call last):
File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/mms/asr/infer/mms_infer.py", line 52, in <module>
process(args)
File "/remote/idiap.svm/user.active/esarkar/speech/fairseq/examples/mms/asr/infer/mms_infer.py", line 44, in process
with open(tmpdir/"hypo.word") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '/idiap/temp/esarkar/tmp/tmpnhi5rrui/hypo.word'
Same issue.
python examples/mms/asr/infer/mms_infer.py --model "models/mms1b_fl102.pt" --lang eng --audio "../testscripts/audio.wav"
>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
File "~/fairseq/examples/speech_recognition/new/infer.py", line 21, in <module>
from examples.speech_recognition.new.decoders.decoder_config import (
File "~/fairseq/examples/__init__.py", line 7, in <module>
from fairseq.version import __version__ # noqa
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "~/fairseq/fairseq/__init__.py", line 20, in <module>
from fairseq.distributed import utils as distributed_utils
File "~/fairseq/fairseq/distributed/__init__.py", line 7, in <module>
from .fully_sharded_data_parallel import (
File "~/fairseq/fairseq/distributed/fully_sharded_data_parallel.py", line 10, in <module>
from fairseq.dataclass.configs import DistributedTrainingConfig
File "~/fairseq/fairseq/dataclass/__init__.py", line 6, in <module>
from .configs import FairseqDataclass
File "~/fairseq/fairseq/dataclass/configs.py", line 1127, in <module>
@dataclass
^^^^^^^^^
File "<location>/opt/anaconda3/envs/mms/lib/python3.11/dataclasses.py", line 1223, in dataclass
return wrap(cls)
^^^^^^^^^
File "<location>/opt/anaconda3/envs/mms/lib/python3.11/dataclasses.py", line 1213, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<location>/opt/anaconda3/envs/mms/lib/python3.11/dataclasses.py", line 958, in _process_class
cls_fields.append(_get_field(cls, name, type, kw_only))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<location>/opt/anaconda3/envs/mms/lib/python3.11/dataclasses.py", line 815, in _get_field
raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory
Traceback (most recent call last):
File "~/fairseq/examples/mms/asr/infer/mms_infer.py", line 52, in <module>
process(args)
File "~/fairseq/examples/mms/asr/infer/mms_infer.py", line 44, in process
with open(tmpdir/"hypo.word") as fr:
^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/7r/6k64fzpn6sx5ml6pb2h67kbw0000gn/T/tmp9ubxk363/hypo.word'
Hello everyone,
I have installed all the remaining dependencies, and when I print the error output it shows "cannot unpack non-iterable NoneType object".
Here is the full log:
>>> preparing tmp manifest dir ...
>>> loading model & running inference ...
Traceback (most recent call last):
File "examples/speech_recognition/new/infer.py", line 21, in <module>
from examples.speech_recognition.new.decoders.decoder_config import (
File "/root/VITS/fairseq/examples/speech_recognition/__init__.py", line 1, in <module>
from . import criterions, models, tasks # noqa
File "/root/VITS/fairseq/examples/speech_recognition/criterions/__init__.py", line 15, in <module>
importlib.import_module(
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/root/VITS/fairseq/examples/speech_recognition/criterions/cross_entropy_acc.py", line 13, in <module>
from fairseq import utils
File "/root/VITS/fairseq/fairseq/__init__.py", line 33, in <module>
import fairseq.criterions # noqa
File "/root/VITS/fairseq/fairseq/criterions/__init__.py", line 18, in <module>
(
TypeError: cannot unpack non-iterable NoneType object
CompletedProcess(args='\n PYTHONPATH=. PREFIX=INFER HYDRA_FULL_ERROR=1 python examples/speech_recognition/new/infer.py -m --config-dir examples/mms/asr/config/ --config-name infer_common decoding.type=viterbi dataset.max_tokens=4000000 distributed_training.distributed_world_size=1 "common_eval.path=\'mms1b_fl102.pt\'" task.data=/tmp/tmptsuf4ig2 dataset.gen_subset="hin:dev" common_eval.post_process=letter decoding.results_path=/tmp/tmptsuf4ig2\n ', returncode=1)
Traceback (most recent call last):
File "examples/mms/asr/infer/mms_infer.py", line 53, in <module>
process(args)
File "examples/mms/asr/infer/mms_infer.py", line 45, in process
with open(tmpdir/"hypo.word") as fr:
I'm running on Ubuntu 20.04.
Could you please run the code again, starting from git clone? Many thanks.
- git clone https://github.com/facebookresearch/fairseq
- cd fairseq
- wget <model>
- docker build -t fairseq:dev .
- docker run --rm -it -e USER=root -v $(pwd):/mms:ro fairseq:dev python examples/mms/asr/infer/mms_infer.py --model /mms/mms1b_fl102.pt --lang eng --audio /mms/audio.wav