Training is too slow. Why?
Two V100 GPUs are used, but average GPU usage is only about 30%.
2023-06-19 11:38:06,489 oprah_ms_istft_vits INFO ====> Epoch: 1
2023-06-19 11:39:16,343 oprah_ms_istft_vits INFO ====> Epoch: 2
2023-06-19 11:40:11,469 oprah_ms_istft_vits INFO ====> Epoch: 3
2023-06-19 11:41:02,040 oprah_ms_istft_vits INFO ====> Epoch: 4
2023-06-19 11:41:49,471 oprah_ms_istft_vits INFO ====> Epoch: 5
2023-06-19 11:42:33,887 oprah_ms_istft_vits INFO ====> Epoch: 6
2023-06-19 11:42:52,609 oprah_ms_istft_vits INFO Train Epoch: 7 [6%]
2023-06-19 11:42:52,609 oprah_ms_istft_vits INFO [1.471453309059143, 3.4994378089904785, 6.886540412902832, 35.85108947753906, 1.053851842880249, 1.9440194368362427, 0.0, 200, 0.0001998500468671882]
2023-06-19 11:43:17,986 oprah_ms_istft_vits INFO ====> Epoch: 7
2023-06-19 11:43:58,944 oprah_ms_istft_vits INFO ====> Epoch: 8
2023-06-19 11:44:40,284 oprah_ms_istft_vits INFO ====> Epoch: 9
2023-06-19 11:45:29,205 oprah_ms_istft_vits INFO ====> Epoch: 10
2023-06-19 11:46:11,159 oprah_ms_istft_vits INFO ====> Epoch: 11
2023-06-19 11:46:53,317 oprah_ms_istft_vits INFO ====> Epoch: 12
2023-06-19 11:47:11,451 oprah_ms_istft_vits INFO Train Epoch: 13 [12%]
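To put numbers on "slow", here is a small sketch in plain Python (not part of the training code) that turns the epoch-end timestamps from the log above into per-epoch durations. The names `LOG` and `epoch_durations` are made up for illustration, and the parser assumes every epoch-end line starts with a 23-character timestamp and ends with the epoch number, as in this log.

```python
from datetime import datetime

# Epoch-end lines copied from the log above.
LOG = """\
2023-06-19 11:38:06,489 oprah_ms_istft_vits INFO ====> Epoch: 1
2023-06-19 11:39:16,343 oprah_ms_istft_vits INFO ====> Epoch: 2
2023-06-19 11:40:11,469 oprah_ms_istft_vits INFO ====> Epoch: 3
2023-06-19 11:41:02,040 oprah_ms_istft_vits INFO ====> Epoch: 4
2023-06-19 11:41:49,471 oprah_ms_istft_vits INFO ====> Epoch: 5
2023-06-19 11:42:33,887 oprah_ms_istft_vits INFO ====> Epoch: 6
2023-06-19 11:43:17,986 oprah_ms_istft_vits INFO ====> Epoch: 7
2023-06-19 11:43:58,944 oprah_ms_istft_vits INFO ====> Epoch: 8
2023-06-19 11:44:40,284 oprah_ms_istft_vits INFO ====> Epoch: 9
2023-06-19 11:45:29,205 oprah_ms_istft_vits INFO ====> Epoch: 10
2023-06-19 11:46:11,159 oprah_ms_istft_vits INFO ====> Epoch: 11
2023-06-19 11:46:53,317 oprah_ms_istft_vits INFO ====> Epoch: 12
"""


def epoch_durations(log_text):
    """Map each epoch to the seconds elapsed since the previous epoch-end line."""
    ends = []
    for line in log_text.splitlines():
        if "====> Epoch:" not in line:
            continue  # skip progress and loss lines
        # The first 23 characters hold the timestamp, e.g. "2023-06-19 11:38:06,489".
        when = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")
        epoch = int(line.rsplit(":", 1)[1])
        ends.append((epoch, when))
    # Epoch 1 has no earlier end line, so it is not in the result.
    return {
        later_epoch: (later - earlier).total_seconds()
        for (_, earlier), (later_epoch, later) in zip(ends, ends[1:])
    }


durations = epoch_durations(LOG)
for epoch, seconds in durations.items():
    print(f"epoch {epoch}: {seconds:.1f} s")
```

On this log it gives roughly 40 to 70 seconds per epoch, with epoch 2 the slowest. Running the same measurement after changing a setting shows whether the change actually helped.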
Same here.
I'm not an expert, but did you try changing the batch size?