failures on Custom_Object_Detection_YOLOX with tiling

Open yunchu opened this issue 2 years ago • 0 comments

The regression tests for the detection task with tiling are failing.

All failures come from the Custom_Object_Detection_YOLOX template.

================================================ short test summary info =================================================
FAILED tests/regression/detection/test_tiling_detection.py::TestRegressionTilingDetection::test_otx_train[Custom_Object_Detection_YOLOX] - AssertionError: Traceback (most recent call last):
FAILED tests/regression/detection/test_tiling_detection.py::TestRegressionTilingDetection::test_otx_train_kpi_test[Custom_Object_Detection_YOLOX] - ValueError: Performance is None.
FAILED tests/regression/detection/test_tiling_detection.py::TestRegressionTilingDetection::test_otx_export_eval_openvino[Custom_Object_Detection_YOLOX] - AssertionError: Traceback (most recent call last):
FAILED tests/regression/detection/test_tiling_detection.py::TestRegressionTilingDetection::test_otx_deploy_eval_deployment[Custom_Object_Detection_YOLOX] - AssertionError: Traceback (most recent call last):
FAILED tests/regression/detection/test_tiling_detection.py::TestRegressionTilingDetection::test_nncf_optimize_eval[Custom_Object_Detection_YOLOX] - AssertionError: Traceback (most recent call last):
FAILED tests/regression/detection/test_tiling_detection.py::TestRegressionTilingDetection::test_pot_optimize_eval[Custom_Object_Detection_YOLOX] - AssertionError: Traceback (most recent call last):

Here are some error messages from the test log:

E       AssertionError: Traceback (most recent call last):
E         File "/home/yunchu/miniconda3/envs/py310/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
E           self.run()
E         File "/home/yunchu/miniconda3/envs/py310/lib/python3.10/multiprocessing/process.py", line 108, in run
E           self._target(*self._args, **self._kwargs)
E         File "/mnt/hdd1/workspace/training_extensions/otx/cli/utils/multi_gpu.py", line 260, in run_child_process
E           train_func()
E         File "/mnt/hdd1/workspace/training_extensions/otx/cli/tools/train.py", line 265, in train
E           task.train(
E         File "/mnt/hdd1/workspace/training_extensions/otx/algorithms/detection/task.py", line 210, in train
E           results = self._train_model(dataset)
E         File "/mnt/hdd1/workspace/training_extensions/otx/algorithms/detection/adapters/mmdet/task.py", line 295, in _train_model
E           train_detector(
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/mmdet/apis/train.py", line 246, in train_detector
E           runner.run(data_loaders, cfg.workflow)
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
E           epoch_runner(data_loaders[i], **kwargs)
E         File "/mnt/hdd1/workspace/training_extensions/otx/algorithms/common/adapters/mmcv/runner.py", line 78, in train
E           for i, data_batch in enumerate(self.data_loader):
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
E           data = self._next_data()
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
E           return self._process_data(data)
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
E           data.reraise()
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/torch/_utils.py", line 543, in reraise
E           raise exception
E       IndexError: Caught IndexError in DataLoader worker process 0.
E       Original Traceback (most recent call last):
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
E           data = fetcher.fetch(index)
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
E           data = [self.dataset[idx] for idx in possibly_batched_index]
E         File "/mnt/hdd1/workspace/training_extensions/.tox/tests-all-py310/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
E           data = [self.dataset[idx] for idx in possibly_batched_index]
E         File "/mnt/hdd1/workspace/training_extensions/otx/algorithms/detection/adapters/mmdet/datasets/dataset.py", line 388, in __getitem__
E           return self.pipeline(self.tile_dataset[idx])
E         File "/mnt/hdd1/workspace/training_extensions/otx/algorithms/detection/adapters/mmdet/datasets/tiling.py", line 374, in __getitem__
E           result = copy.deepcopy(self.tiles[idx])
E       IndexError: list index out of range
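
The last two frames show `self.tiles[idx]` raising inside a DataLoader worker, so the sampler handed the dataset an index that is not smaller than `len(self.tiles)`. One possible cause, which is an assumption and not confirmed by this log, is that the length reported to the sampler does not match the tile list. Below is a minimal sketch for checking that kind of mismatch outside the DataLoader. `TileListSketch`, `WrapperSketch`, and `first_bad_index` are hypothetical names made up for illustration; they are not the OTX classes.

```python
import copy


class TileListSketch:
    """Hypothetical stand-in for a tile dataset (NOT the OTX class)."""

    def __init__(self, tiles):
        self.tiles = list(tiles)

    def __len__(self):
        return len(self.tiles)

    def __getitem__(self, idx):
        # Same access pattern as the failing frame in the traceback.
        return copy.deepcopy(self.tiles[idx])


class WrapperSketch:
    """Hypothetical wrapper whose reported length may be stale (assumption)."""

    def __init__(self, tile_dataset, reported_len):
        self.tile_dataset = tile_dataset
        self.reported_len = reported_len

    def __len__(self):
        return self.reported_len

    def __getitem__(self, idx):
        return self.tile_dataset[idx]


def first_bad_index(dataset):
    """Return the first index below len(dataset) that raises IndexError, else None."""
    for idx in range(len(dataset)):
        try:
            dataset[idx]
        except IndexError:
            return idx
    return None
```

Running the same kind of loop over the real training dataset, in the main process rather than in a worker, should show whether the reported length and the tile list disagree, and at which index.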

Steps to Reproduce

  1. Prepare the dataset at /mnt/hdd1/data/ci_datasets
  2. CI_DATA_ROOT=/mnt/hdd1/data/ci_datasets tox -e tests-all-py310 -- tests/regression/detection/test_tiling_detection.py

Environment:

  • OS: Ubuntu 20.04
  • Framework version: used tox testenv
  • Python version: 3.10
  • OpenVINO version: used tox testenv
  • CUDA/cuDNN version: 11.7.1
  • GPU model and memory: 24G

yunchu avatar Jun 23 '23 05:06 yunchu