Suggestions for bad temporal inference and illogical spatial assignments within the scene

Open • deepsworld opened this issue 2 years ago • 1 comment

Thank you for sharing your great work. We are trying to use the ReST code and framework on the MMPTrack dataset, which has fully overlapping camera views and a larger training set than Wildtrack. The training and inference code runs fine, but the results are poor. We did not modify the code other than the dataset handling needed to support this custom dataset. Could you please provide some suggestions or advice on the following issues:

Frame-wise tracking in a single camera view produces track IDs that change almost every frame.
Spatial assignment across cameras somewhat works, but it assigns the same track ID to multiple people in the same camera view.
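
Both symptoms can be quantified before digging into the model. The sketch below is not part of ReST; it assumes the predictions can be exported as hypothetical `(frame, camera, gt_id, track_id)` tuples, which is possible here because the ground-truth boxes are used as detections. It counts how often the predicted ID changes for each ground-truth person within a camera, and lists the track IDs that appear more than once in the same frame and view.

```python
from collections import Counter, defaultdict


def count_id_switches(records):
    """Count predicted-ID changes per (camera, gt_id) over time.

    records: iterable of (frame, camera, gt_id, track_id) tuples.
    This tuple layout is an assumption for illustration only.
    """
    by_person = defaultdict(list)
    for frame, camera, gt_id, track_id in records:
        by_person[(camera, gt_id)].append((frame, track_id))

    switches = {}
    for key, observations in by_person.items():
        observations.sort()  # order by frame number
        ids = [track_id for _, track_id in observations]
        # A switch is any pair of consecutive observations with different IDs.
        switches[key] = sum(1 for a, b in zip(ids, ids[1:]) if a != b)
    return switches


def find_duplicate_ids(records):
    """Return {(frame, camera): [track IDs assigned to more than one box]}."""
    counts = defaultdict(Counter)
    for frame, camera, _gt_id, track_id in records:
        counts[(frame, camera)][track_id] += 1

    duplicates = {}
    for key, counter in counts.items():
        repeated = sorted(tid for tid, n in counter.items() if n > 1)
        if repeated:
            duplicates[key] = repeated
    return duplicates
```

A high switch count for nearly every `(camera, gt_id)` pair corresponds to the first symptom, and any non-empty result from `find_duplicate_ids` corresponds to the second.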

Please find a sequence of 5 annotated frames (frames #21-25; each image contains a grid of all views) from the MMPTrack-pretrained model for your reference. We use the same config as Wildtrack, and both training and model predictions use ground-truth boxes.

grid_21 grid_22 grid_23 grid_24 grid_25

Thanks, Deep

Apr 30 '24 23:04 deepsworld

Thanks for your interest in our research. Our model relies mostly on geometric position features, as stated in the paper. You need to make sure that the projections of the same person from different views are close to each other while the projections of different people are far apart. To achieve this, refine the homography matrices for your dataset instead of using homographies from other datasets. The method may not work well when the region is too small and crowded.
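
One way to check this condition is to project the ground-truth boxes with each camera's homography and compare distances on the ground plane. The following is a minimal sketch, not the ReST implementation. It assumes boxes are `(x, y, w, h)` with a top-left origin, that the bottom center of the box is the point being projected, and that each homography is a 3x3 image-to-ground matrix given as nested lists. The function names are hypothetical.

```python
import math
from itertools import combinations


def project_to_ground(H, u, v):
    """Apply a 3x3 homography H (nested lists) to the image point (u, v)."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return x / w, y / w


def projection_gaps(detections, homographies):
    """Return (max same-person distance, min different-person distance).

    detections: iterable of (camera, person_id, (x, y, w, h)) for one frame.
    homographies: dict mapping camera -> 3x3 image-to-ground matrix.
    Both layouts are assumptions for illustration only.
    """
    points = []
    for camera, person_id, (x, y, w, h) in detections:
        # Assumed foot point: bottom center of the bounding box.
        gx, gy = project_to_ground(homographies[camera], x + w / 2.0, y + h)
        points.append((person_id, gx, gy))

    same, different = [], []
    for (id_a, xa, ya), (id_b, xb, yb) in combinations(points, 2):
        distance = math.hypot(xa - xb, ya - yb)
        (same if id_a == id_b else different).append(distance)

    return (max(same) if same else None, min(different) if different else None)
```

If the largest same-person distance is close to or larger than the smallest different-person distance on many frames, the projected positions do not separate people well, and the homographies (or the choice of projected point) probably need refinement for the new dataset.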

Best

May 04 '24 15:05 chengche6230