
Training on a new dataset

chandlj opened this issue 2 years ago · 3 comments

I am looking to train TAP-Net on a new dataset, in particular by modifying the existing datasets (DAVIS, Kubric, RGB Stacking, etc.) to use new keypoints that we generate. It is not immediately clear to me how to add a new dataset, or how to use the existing scripts such as experiment.py to train TAP-Net on it.

It looks like Kubric is the only dataset supported for training, whereas DAVIS and RGB Stacking are included only for inference. Could you walk me through the format TAP-Net expects from a dataset, and where in the code/config I would need to add functionality to use a new dataset?

chandlj avatar Apr 13 '23 17:04 chandlj

Kubric uses the same format as the other datasets:

- query_points in (t, y, x) format, shape [batch, num_points, 3]
- target_points in (x, y) format, shape [batch, num_points, num_frames, 2]
- a binary occlusion flag (1 if occluded, 0 otherwise), shape [batch, num_points, num_frames]
- the video, scaled between -1 and 1, shape [batch, num_frames, height, width, 3]

To train on something else, you'd need to edit experiment.py to add a new dataset_constructor that returns a Python generator yielding dicts containing the above fields. Then you need to edit the config file to add the desired dataset name and its kwargs to datasets. That is, once you've written the generator, it should only be a few lines of code; a minimal sketch is shown below.
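As a hedged illustration, here is a minimal sketch of such a generator. The field names ('video', 'query_points', 'target_points', 'occluded') match the shapes described above but are assumptions based on the Kubric loader, so verify them against your checkout; the random arrays are placeholders for real data loading.

```python
import numpy as np

def my_dataset_generator(batch_size=1, num_frames=24, height=256, width=256,
                         num_points=256):
  """Hypothetical generator yielding batches in the format described above.

  Replace the random arrays with real data loading; check the Kubric loader
  for the exact field names used in your version of the repo.
  """
  while True:
    # Video scaled to [-1, 1], shape [batch, num_frames, height, width, 3].
    video = np.random.uniform(
        -1.0, 1.0,
        (batch_size, num_frames, height, width, 3)).astype(np.float32)
    # Query points in (t, y, x) order, shape [batch, num_points, 3].
    query_points = np.stack([
        np.random.uniform(0, num_frames, (batch_size, num_points)),
        np.random.uniform(0, height, (batch_size, num_points)),
        np.random.uniform(0, width, (batch_size, num_points)),
    ], axis=-1).astype(np.float32)
    # Ground-truth tracks in (x, y) order,
    # shape [batch, num_points, num_frames, 2].
    target_points = np.stack([
        np.random.uniform(0, width, (batch_size, num_points, num_frames)),
        np.random.uniform(0, height, (batch_size, num_points, num_frames)),
    ], axis=-1).astype(np.float32)
    # 1 where the point is occluded in that frame, 0 where it is visible.
    occluded = np.random.randint(
        0, 2, (batch_size, num_points, num_frames)).astype(np.float32)
    yield {
        'video': video,
        'query_points': query_points,
        'target_points': target_points,
        'occluded': occluded,
    }
```

Once the constructor is registered in experiment.py, you'd add the dataset name and its kwargs (e.g. num_frames) to the datasets entry in the config, as described above.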

Note that the code is set up for multi-dataset training, meaning that the training class will receive a dict keyed by dataset name, with an example from each dataset. You may need to change the input_key in supervised_point_prediction.py so that it uses the correct dataset; a sketch of that wiring follows.
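As a hedged sketch of that keying, reusing the generator above (the name 'my_dataset' and the constructor name are hypothetical):

```python
def my_dataset_constructor(**kwargs):
  """Hypothetical dataset_constructor registered in experiment.py."""
  for example in my_dataset_generator(**kwargs):  # generator sketched above
    # Multi-dataset training: each batch is a dict keyed by dataset name,
    # so wrap every example under the same name used in the config.
    yield {'my_dataset': example}
```

Then point input_key in supervised_point_prediction.py at the same name ('my_dataset' here) so the training class reads examples from the correct sub-dict.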

cdoersch avatar Apr 21 '23 10:04 cdoersch

Thanks for the reply, that worked well for me. What config parameters did you set for fine-tuning, so that I can fine-tune the existing model on the new dataset I added? Right now I'm using the checkpoint.npy file that is linked in the README, but it already records 100k completed training steps, so in my config I have to set training_steps to something greater than 100k to get it to run at all. In the paper you mention that for fine-tuning you ran 5000 steps with 100 warmup steps and a learning rate of 1e-5; how do I set those parameters so they're compatible with the existing checkpoint.npy file? Did you change the weight decay parameters at all?
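For reference, this is roughly what I'm trying now; the field paths are placeholders from my guess at the config layout and may not match the repo, and the 100k offset reflects the step count already stored in checkpoint.npy:

```python
def apply_finetune_overrides(config):
  """Hypothetical fine-tuning overrides; treat the field paths as placeholders."""
  # Checkpoint already reports 100k steps, so the schedule must extend past it.
  config.training_steps = 100_000 + 5_000
  # Paper settings for fine-tuning: 100 warmup steps, learning rate 1e-5.
  config.experiment_kwargs.config.optimizer.base_lr = 1e-5
  config.experiment_kwargs.config.optimizer.warmup_steps = 100
  return config
```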

chandlj avatar May 21 '23 19:05 chandlj

Can I use my personal dataset that is not related to any of the ones mentioned above?

aloma85 avatar Jul 12 '23 05:07 aloma85