
Questions about the training from scratch

gonglixue opened this issue 6 years ago · 9 comments

Hi. I used the provided code to train TimeCycle on some other video datasets. Finetuning the network from the provided checkpoint_14.pth.tar works fine. But when I train the network from scratch, neither the inlier loss nor the theta loss decreases. Are there any tips for training TimeCycle from scratch?
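(For context, the finetuning that works for me is just standard checkpoint loading; a minimal sketch, where the CycleTime class name and the 'state_dict' key are my assumptions about the repo layout:)

```python
import torch
from models.videos.model_simple import CycleTime  # assumed class name

model = CycleTime()  # constructor args omitted for brevity
checkpoint = torch.load('checkpoint_14.pth.tar', map_location='cpu')
# Released PyTorch checkpoints typically store the weights under 'state_dict'.
model.load_state_dict(checkpoint['state_dict'])
```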

gonglixue avatar Sep 15 '19 10:09 gonglixue

@gonglixue - when you visualized your training results, did you ever get a blocky output? We're running into similar problems, and I'm wondering whether this happened for you too.

JBKnights avatar Sep 23 '19 05:09 JBKnights

same problem... :(

roeez avatar Sep 23 '19 11:09 roeez

I have trained it from scratch successfully : ) First, set detach_network=True in model_simple.py, which freezes the feature extractor. Then set detach_network=False to train the whole network end-to-end.
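In other words, detach_network=True cuts the gradient flow into the encoder, so only the transformation part trains in the first stage. A minimal sketch of the pattern (the class and attribute names here are illustrative, not the repo's exact module):

```python
import torch.nn as nn

class TwoStageModel(nn.Module):
    """Illustrative freeze/unfreeze pattern, not TimeCycle's actual module."""

    def __init__(self, encoder, head, detach_network=True):
        super().__init__()
        self.encoder = encoder
        self.head = head
        self.detach_network = detach_network

    def forward(self, x):
        feat = self.encoder(x)
        if self.detach_network:
            # Stage 1: stop gradients here, so the encoder stays frozen.
            feat = feat.detach()
        return self.head(feat)
```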

gonglixue avatar Sep 23 '19 11:09 gonglixue

Thanks! i will try

roeez avatar Sep 23 '19 11:09 roeez

> @gonglixue - when you visualized your training results, did you ever get a blocky output? We're running into similar problems, and I'm wondering whether this happened for you too.

I didn't come across the blocky-output problem. Using the code in transformation.py to transform an image with a given affine matrix works correctly for me.
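If you want to sanity-check the affine warp independently of transformation.py, the same operation can be reproduced with plain PyTorch. This is a sketch, not the repo's code, and the align_corners convention may differ from what transformation.py uses:

```python
import torch
import torch.nn.functional as F

def affine_warp(img, theta):
    """Warp a batch of images with 2x3 affine matrices.

    img:   (N, C, H, W) tensor
    theta: (N, 2, 3) affine matrices in normalized coordinates
    """
    grid = F.affine_grid(theta, img.size(), align_corners=False)
    return F.grid_sample(img, grid, align_corners=False)

# The identity transform should return the image (up to interpolation error).
img = torch.rand(1, 3, 64, 64)
identity = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
assert torch.allclose(affine_warp(img, identity), img, atol=1e-4)
```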

gonglixue avatar Sep 23 '19 11:09 gonglixue

Can you please provide more details? Did you also set can_detach=True in the forward_base method? So first you detach the encoder and train the transformation network for a few epochs, and then you set detach_network=False and train for more epochs? Are the optimizer and its settings the same as stated in the paper?

My loss_targ_theta_skip is very noisy, and the back_inliers loss vanishes very early...

Thanks :)

roeez avatar Sep 23 '19 12:09 roeez

> Can you please provide more details? Did you also set can_detach=True in the forward_base method? So first you detach the encoder and train the transformation network for a few epochs, and then you set detach_network=False and train for more epochs? Are the optimizer and its settings the same as stated in the paper?
>
> My loss_targ_theta_skip is very noisy, and the back_inliers loss vanishes very early...
>
> Thanks :)

My full training process is as follows:

1. Completely detach the feature extractor. That means https://github.com/xiaolonw/TimeCycle/blob/16d33ac0fb0a08105a9ca781c7b1b36898e3b601/models/videos/model_simple.py#L166 always evaluates to True:

   ```python
   # Set detach_network=True in __init__() and drop the can_detach condition:
   # if self.detach_network and can_detach:
   if self.detach_network:
       x_pre = x_pre.detach()
   ```

   In this step I set lamda=0.3, lr=2e-4, and the inlier loss decreases only a little (see the training-loop sketch after this list).

2. After step 1 converges, set detach_network=False to train the whole network; everything else is the same as the original code, so line 166 is back to:

   ```python
   if self.detach_network and can_detach:
       x_pre = x_pre.detach()
   ```

   In this step I found that the theta loss had almost converged while the inlier loss decreased slowly, so I reduced the weight of the theta loss to lamda=0.1 and used a larger learning rate (lr=3e-4).

3. Use a smaller learning rate (lr=2e-4, lamda=0.1) to finetune.
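Putting the three stages together, my loop looks roughly like the sketch below. The Adam optimizer, the (loss_inlier, loss_theta) return signature, and the function name are assumptions for illustration, not the repo's exact training script:

```python
import torch

def train_stage(model, loader, lr, lamda, epochs):
    """One training stage: weight the theta loss by `lamda`, train at `lr`."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for batch in loader:
            loss_inlier, loss_theta = model(batch)  # assumed return values
            loss = loss_inlier + lamda * loss_theta
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# Stage 1: encoder frozen (detach_network=True), higher theta weight.
# train_stage(model, loader, lr=2e-4, lamda=0.3, epochs=...)
# Stage 2: full network (detach_network=False), larger lr, lower theta weight.
# train_stage(model, loader, lr=3e-4, lamda=0.1, epochs=...)
# Stage 3: finetune at a smaller learning rate.
# train_stage(model, loader, lr=2e-4, lamda=0.1, epochs=...)
```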

My training process seems a little complicated. For some video data, I had to adjust the hyperparameters back and forth...

gonglixue avatar Sep 23 '19 12:09 gonglixue

Thank you very much for the detailed answer, you are great!

roeez avatar Sep 23 '19 12:09 roeez

Thanks so much for the help! Out of curiosity, how many epochs did each of the steps take? I.e., how much training did you do before you unfroze the feature extractor?

JBKnights avatar Sep 24 '19 00:09 JBKnights