Joon
In my opinion, the authors define L_{vlb} = L_0 + ... + L_T as a sum over all timesteps, not as a single term L_t. Thus, they may compute the vlb loss with a scale factor of T (self.num_timestep).
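One way to read the scale factor, written out as a short derivation. It assumes that t is sampled uniformly from the T trainable terms and that L_T contains no trainable parameters; both are assumptions about the training setup, not something stated in the comment above.

```latex
% Assumption: t ~ Uniform{0, ..., T-1}, one term L_t evaluated per training step.
\begin{align}
L_{vlb} &= L_0 + L_1 + \dots + L_{T-1} + L_T \\
\mathbb{E}_{t \sim \mathcal{U}\{0,\dots,T-1\}}\left[ T \cdot L_t \right]
  &= T \cdot \frac{1}{T} \sum_{t=0}^{T-1} L_t
   = \sum_{t=0}^{T-1} L_t
   = L_{vlb} - L_T
\end{align}
% Under the assumption that L_T is constant in the model parameters,
% T * L_t is an unbiased single-sample estimate of the trainable part of L_{vlb}.
```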
In my opinion, when you predict x_start at t \approx T with a cosine noise schedule, the bar alphas (cumprod of alphas) are much smaller than with a linear noise schedule....
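A small sketch to check this numerically. It assumes T = 1000, a linear schedule from 1e-4 to 0.02, and a cosine schedule with offset s = 0.008 and betas clipped at 0.999; these are commonly used defaults, not values taken from the comment above, and the function names are made up for illustration.

```python
import math


def linear_alpha_bar(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    out, prod = [], 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / (T - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out


def cosine_alpha_bar(T, s=0.008, max_beta=0.999):
    """Cumulative product of (1 - beta_t) for a cosine schedule with clipped betas."""

    def f(step):
        return math.cos((step / T + s) / (1 + s) * math.pi / 2) ** 2

    out, prod = [], 1.0
    for i in range(T):
        # beta_i is derived from the ratio of consecutive f values, then clipped.
        beta = min(1.0 - f(i + 1) / f(i), max_beta)
        prod *= 1.0 - beta
        out.append(prod)
    return out


T = 1000  # assumed number of diffusion steps
linear = linear_alpha_bar(T)
cosine = cosine_alpha_bar(T)

# Compare the last entry of each schedule (t = T).
print(f"linear alpha_bar at t=T: {linear[-1]:.3e}")
print(f"cosine alpha_bar at t=T: {cosine[-1]:.3e}")
```

Under these assumptions the last cosine entry comes out several orders of magnitude below the linear one.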
When I didn't use the DistributedSampler in the dataloader, this problem became less severe, but it did not go away.
I followed this post (https://ppwwyyxx.com/blog/2022/Demystify-RAM-Usage-in-Multiprocess-DataLoader/). In sam3/train/data/coco_json_loaders.py, we can add a "TorchSerializedList" class and modify the load_coco_and_group_by_image function.
```python
class TorchSerializedList:
    """
    Alternative implementation using torch.Tensor for spawn/forkserver mode.
    torch.Tensor can be...
```
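For illustration, here is a minimal sketch of the general idea from the linked post: pickle every item once and keep all bytes in a single flat array, so that dataloader workers share one buffer instead of many small Python objects. It uses NumPy rather than torch.Tensor to stay short, the class name SerializedList and the sample records are hypothetical, and it is not the TorchSerializedList from the comment above.

```python
import pickle

import numpy as np


class SerializedList:
    """Hypothetical sketch: a read-only list stored as one flat uint8 buffer."""

    def __init__(self, items):
        # Pickle each item separately so it can be decoded on its own later.
        buffers = [
            np.frombuffer(pickle.dumps(item, protocol=pickle.HIGHEST_PROTOCOL), dtype=np.uint8)
            for item in items
        ]
        # _addr[i] is the end offset of item i inside the flat buffer.
        self._addr = np.cumsum([len(b) for b in buffers], dtype=np.int64)
        self._data = np.concatenate(buffers) if buffers else np.zeros(0, dtype=np.uint8)

    def __len__(self):
        return len(self._addr)

    def __getitem__(self, idx):
        if idx < 0:
            idx += len(self)
        if not 0 <= idx < len(self):
            raise IndexError("index out of range")
        start = 0 if idx == 0 else int(self._addr[idx - 1])
        end = int(self._addr[idx])
        return pickle.loads(self._data[start:end].tobytes())


# Made-up records standing in for per-image annotation groups.
records = [
    {"image_id": 1, "annotations": [{"bbox": [0, 0, 10, 10]}]},
    {"image_id": 2, "annotations": []},
    {"image_id": 3, "annotations": [{"bbox": [5, 5, 20, 20]}]},
]
serialized = SerializedList(records)
print(len(serialized), serialized[0]["image_id"])
```

Items are decoded lazily on access, so the cost moves from memory to a small unpickling step per lookup.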
I guess the problem might be "_target_: sam3.train.transforms.segmentation.DecodeRle". According to your JSON example, the segmentation is not in RLE format. You can convert it into RLE format.
Perhaps you can manually provide the input points and input bboxes in the COCO_FROM_JSON class of coco_json_loader.py (if you train the model on images). See "loadQueriesAndAnnotationsFromDatapoint" in COCO_FROM_JSON. Or, if...
I guess you can only fine-tune the detector and the shared backbone. According to the current implementation, it appears that training code for the tracker modules is not provided. You...
@svengoluza You can manually turn off the requires_grad setting in the Trainer where the model is instantiated.
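As a rough sketch of what turning off requires_grad can look like in plain PyTorch. The toy model and the submodule names frozen_part and trainable_part are hypothetical stand-ins, not the actual module names in the repository.

```python
import torch
from torch import nn


def freeze_module(module: nn.Module) -> None:
    """Disable gradient computation for every parameter of the given module."""
    for param in module.parameters():
        param.requires_grad_(False)


# Toy stand-in for the real model; the submodule names are made up.
model = nn.ModuleDict(
    {
        "trainable_part": nn.Linear(8, 4),
        "frozen_part": nn.Linear(8, 2),
    }
)

# Call this right after the model is instantiated, before building the optimizer.
freeze_module(model["frozen_part"])

# Pass only the parameters that still require gradients to the optimizer.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-4)

trainable_names = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable_names)
```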
@svengoluza For me, yes.
I'm sorry, I pre-processed my data incorrectly. The mistake was in converting the data into COCO format. Thank you for the reply!