Error in trust_region_newton_cg.py

Open athenas-lab opened this issue 2 years ago • 6 comments

Hi, I am getting the following error when call_sgnify() is called in the stage before the video is generated. All of the previous stages passed without errors. Please let me know how I can fix this. Thanks.

 Running SGNify...
      0%|                                                                                                                                                               | 1/1380 [02:57<68:06:07, 177.79s/it]
    Traceback (most recent call last):
      File "SGNify/smplifyx/main.py", line 261, in <module>
        main(**args)
      File "SGNify/smplifyx/main.py", line 232, in main
        fit_single_frame(img, keypoints[[0]],
      File "SGNify/smplifyx/fit_single_frame.py", line 650, in fit_single_frame
        final_loss_val = monitor.run_fitting(
      File "SGNify/smplifyx/fitting.py", line 176, in run_fitting
        loss = optimizer.step(closure)
      File "/home/ubuntu/envs/sgnify/lib/python3.10/site-packages/torch/optim/optimizer.py", line 280, in wrapper
        out = func(*args, **kwargs)
      File "SGNify/smplifyx/optimizers/trust_region_newton_cg.py", line 301, in step
        param_step, hit_boundary = self._solve_trust_reg_subproblem(
      File "/home/ubuntu/envs/sgnify/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "SGNify/smplifyx/optimizers/trust_region_newton_cg.py", line 277, in _solve_trust_reg_subproblem
        raise RuntimeError
    RuntimeError
      0%|                                                                                                                                                              | 1/1380 [05:15<120:44:17, 315.20s/it]
    Traceback (most recent call last):
      File "SGNify/sgnify.py", line 513, in <module>
        main(args)
      File "SGNify/sgnify.py", line 474, in main
        run_sgnify(
      File "SGNify/sgnify.py", line 251, in run_sgnify
        call_sgnify(
      File "SGNify/sgnify.py", line 138, in call_sgnify
        return run(
      File "/home/ubuntu/envs/sgnify/lib/python3.10/subprocess.py", line 526, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['python', 'smplifyx/main.py', '--config', 'cfg_files/fit_sgnifyx_sv.yaml', '--output_folder', PosixPath('SGNify/results/test_video'), '--data_folder', PosixPath('SGNify/results/test_video/.tmp/tmp/data'), '--prev_res_path', PosixPath('SGNify/results/test_video/results/001.pkl'), '--expression_precomputed', 'True', '--expression_path', PosixPath('SGNify/results/test_video/.tmp/spectre/spectre_2.pkl'), '--use_symmetry', 'False', '--symmetry_weight', '18.0', '--left_handpose_path', 'None', '--left_reference_weight', '18.0', '--right_handpose_path', PosixPath('SGNify/results/test_video/.tmp/rps/interp_right/001.pkl'), '--right_reference_weight', '18.0', '--beta_precomputed', 'True']' returned non-zero exit status 1.

athenas-lab avatar Jul 01 '23 02:07 athenas-lab

Yes +1 I am now also facing this issue!!!

Do you also experience that sometimes it is able to run some frames before crashing?

Have you fixed it? I'm trying right now ;)

AIMads avatar Jul 04 '23 20:07 AIMads

It seems like things break in the loss calculation.

AIMads avatar Jul 04 '23 21:07 AIMads

It just worked on another input video of mine, so whether it works or not depends on what it is processing. My best guess so far is that everything collapses if the loss becomes zero or negative, but I don't know yet.

AIMads avatar Jul 05 '23 06:07 AIMads

call_sgnify_0() which optimizes the first frame works without error.

But call_sgnify(), which optimizes the subsequent frames, generates the error above. The error occurs when the optimizer is set to trustnewcg.

I tried setting the optimizer in fit_sgnifyx_sv.yaml to "lbfgs", which is what the original smplifyx code uses. But this results in NaN for the body loss value (final_loss_val) in fit_single_frame() and the resulting meshes are invalid.

I also tried setting the optimizer to "adam". This does not cause any error and produces meshes but the poses in the meshes do not match the original video.

If TrustNewtonCG is the right optimizer, then I would appreciate help with fixing the error raised in _solve_trust_reg_subproblem() when call_sgnify() runs. Thanks.
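
To narrow down whether the loss goes non-finite (as with the NaN seen under "lbfgs") before the optimizer fails, one option is to wrap the closure that is passed to optimizer.step(closure). The following is only a minimal sketch: `checked_closure` and `history` are hypothetical names, not part of SGNify or smplifyx, and the stand-in closure returns plain floats instead of real loss tensors.

```python
import math


def checked_closure(closure, history, label="loss"):
    """Wrap a closure so each loss value is recorded and non-finite values fail loudly.

    Hypothetical helper for debugging; not part of SGNify or smplifyx.
    """
    def wrapped():
        loss = closure()
        value = float(loss)  # works for Python numbers and single-element tensors
        history.append(value)
        if not math.isfinite(value):
            raise FloatingPointError(
                f"{label} became non-finite at evaluation {len(history)}: {value}"
            )
        return loss
    return wrapped


# Stand-in closure that replays a few finite loss values.
_values = iter([3.2, 0.7, 0.0])
history = []
guarded = checked_closure(lambda: next(_values), history)
for _ in range(3):
    guarded()
print(history)
```

Passing the wrapped closure to the optimizer instead of the original one would show which evaluation first produces NaN or inf, and the recorded history shows whether the loss reached zero or went negative beforehand.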

athenas-lab avatar Jul 05 '23 21:07 athenas-lab

I am also facing the same issue. Did you find a solution? Thanks

Daksitha avatar Jul 12 '24 15:07 Daksitha

The problem could be that the weights are too strong. Try reducing the weights in the config file, in particular the weights related to the temporal loss. Ideally, these weights should be adapted to the input fps, but we did not implement this.
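
A rough sketch of scaling weights down on an already loaded config, assuming the config is a plain dict. The key `temporal_weight` and the helper `scale_weights` are hypothetical (check fit_sgnifyx_sv.yaml for the actual key names), and the factor of 0.5 is just a starting point for trial and error.

```python
def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def scale_weights(cfg, factor, match="_weight"):
    """Return a copy of cfg where numeric values under keys containing `match` are scaled.

    Hypothetical helper; handles nested dicts and lists of numbers.
    """
    scaled = {}
    for key, value in cfg.items():
        if isinstance(value, dict):
            scaled[key] = scale_weights(value, factor, match)
        elif match in key and _is_number(value):
            scaled[key] = value * factor
        elif match in key and isinstance(value, list) and all(_is_number(v) for v in value):
            scaled[key] = [v * factor for v in value]
        else:
            scaled[key] = value
    return scaled


# "temporal_weight" is an assumed key name; the other two appear as flags in the failing command.
cfg = {"temporal_weight": 10.0, "symmetry_weight": 18.0, "right_reference_weight": 18.0}
reduced = scale_weights(cfg, 0.5, match="temporal")
print(reduced)
```

Only keys containing the `match` substring are touched, so the other weights stay as they are while the temporal ones are tuned.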

MPForte avatar Nov 01 '24 17:11 MPForte