Charles Lu comments





Acceptable Size of Images for CLIP

You should upsample the image to (224, 224). ResNet performs 32x downsampling, so a small input leaves almost no spatial resolution at the output, and ViT needs a fixed-size input to split into patches. Smaller images will therefore cause problems with either backbone.
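To make the size argument concrete, here is a rough sketch of the arithmetic. It approximates the downsampling as plain integer division by the stride (real networks use padded convolutions, so exact sizes can differ), and the helper names `grid_size` and `num_patches` are made up for illustration, not part of any CLIP code.

```python
def grid_size(height: int, width: int, stride: int = 32) -> tuple:
    """Approximate spatial grid left after reducing each side by `stride`.

    Assumption: downsampling is modeled as floor division, which ignores
    padding details of the actual convolution layers.
    """
    return height // stride, width // stride


def num_patches(height: int, width: int, patch: int = 32) -> int:
    """Number of non-overlapping patch x patch tiles that fit in the image."""
    rows, cols = grid_size(height, width, patch)
    return rows * cols


# Compare the expected 224x224 input with progressively smaller images.
for side in (224, 64, 32, 16):
    print(side, grid_size(side, side), num_patches(side, side, patch=16))
```

Under this approximation a 224x224 input gives a 7x7 grid at 32x downsampling, while a 16x16 input gives nothing at all, and the patch count changes with the input size, which a model trained with a fixed number of patches does not expect. Resizing the input to 224x224 before feeding it to the model avoids both issues.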

The transformation matrix between cameras in TAPVid-3D

Dear Authors, Thank you for your exceptional work and this wonderful dataset! I have a similar question: based on my understanding, the released 3D trajectories of key points are in...

Meaning of the annotation_dict entry of CoTracker3_Kubric data

The values in annot_dict["depth"] look similar to the data used here: https://github.com/google-research/kubric/blob/0ee21e2a723b2131123d67e55d1f65b6d0e6cf0f/challenges/point_tracking/dataset.py#L536-L540