About running inference on a real video
Dear author, I have a video recorded from a camera mounted on a car. How can I use the UniAD model to run inference on this video?
Dear @YTEP-ZHI @Yihanhu @faikit, could you kindly give us some recommendations on this? I saw a very old open issue that still has no reply. Best regards.
@BaophanN It involves heavy engineering work. The most straightforward way is to align your data with the format of the nuScenes dataset.
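For reference, a minimal sketch of what one frame in a nuScenes-style "infos" record looks like is below. The field names follow the mmdet3d nuScenes converter convention; the helper function, numeric values, and camera setup here are hypothetical, and you would need to fill them from your own recordings and rig calibration.

```python
import numpy as np

def make_frame_info(timestamp_us, ego2global_t, ego2global_q, cam_paths, intrinsics):
    """Build a minimal per-frame info dict in a nuScenes-like layout (sketch)."""
    info = {
        "token": f"frame_{timestamp_us}",              # unique frame id
        "timestamp": timestamp_us,                     # microseconds
        "ego2global_translation": list(ego2global_t),  # [x, y, z] in meters
        "ego2global_rotation": list(ego2global_q),     # quaternion [w, x, y, z]
        "cams": {},
    }
    for name, path in cam_paths.items():
        info["cams"][name] = {
            "data_path": path,
            "cam_intrinsic": intrinsics[name],         # 3x3 K matrix
            # sensor2ego comes from your rig calibration (placeholder values)
            "sensor2ego_translation": [0.0, 0.0, 1.5],
            "sensor2ego_rotation": [1.0, 0.0, 0.0, 0.0],
        }
    return info

# Example with made-up values for a single front camera
K = np.array([[1266.4, 0.0, 816.3],
              [0.0, 1266.4, 491.5],
              [0.0, 0.0, 1.0]])
frame = make_frame_info(
    1533151603512404,
    ego2global_t=[600.1, 1647.5, 0.0],
    ego2global_q=[0.97, 0.0, 0.0, 0.24],
    cam_paths={"CAM_FRONT": "samples/CAM_FRONT/img0.jpg"},
    intrinsics={"CAM_FRONT": K},
)
```

In practice you would emit one such record per frame and serialize the list into the pkl file the dataloader expects.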
@ilnehc Thanks for your response. As far as I know, UniAD is vision-based, however, the input of the model also requires nuscenes ego2global rotation and translation. If I have a sequence of images only, how can I obtain good prediction from the model without retraining on new data? Is there a way I can get the simliar ego2global info given only my video input. Thank you ./.
@BaophanN Driving in a 3D world requires at least camera intrinsics and extrinsics; it is not as simple as a 2D task like detection. If they were not recorded, you may try Structure-from-Motion methods to estimate them.
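To make the two terms concrete, here is a minimal sketch of how intrinsics (K) and extrinsics (R, t) are used to project a 3D world point into the image. The numeric values are invented for illustration; an SfM tool such as COLMAP would estimate K, R, and t from your video.

```python
import numpy as np

# Made-up intrinsics: fx, fy = focal lengths; (cx, cy) = principal point
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

# Made-up extrinsics: world -> camera rotation and translation
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])

def project(point_world):
    """Project a 3D world point to pixel coordinates (u, v)."""
    p_cam = R @ point_world + t   # extrinsics: world frame -> camera frame
    uvw = K @ p_cam               # intrinsics: camera frame -> image plane
    return uvw[:2] / uvw[2]       # perspective divide

# A point on the optical axis lands at the principal point (cx, cy)
uv = project(np.array([0.0, 0.0, 5.0]))  # -> [640., 360.]
```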
Thank you for your valuable response. May I ask one last question? I used Structure from Motion (COLMAP) to obtain the poses to pass to the UniAD model. However, what is the difference between the ego2global pose from nuScenes and the world2cam pose from a Structure-from-Motion model?
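For what it's worth, the two poses differ in both direction and frame: COLMAP's world2cam maps world points into the camera frame (in an arbitrary, scale-ambiguous reconstruction frame), while nuScenes' ego2global maps the ego-vehicle frame into a metric global map frame. A sketch of how one might compose an ego2global-like pose from a COLMAP pose, assuming a known camera-to-ego rig calibration (the matrices below are placeholders):

```python
import numpy as np

def inv_se3(T):
    """Invert a 4x4 rigid-body transform."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti

# COLMAP gives world2cam per image; with a known cam2ego calibration,
# an ego2global-like pose can be composed as:
#   ego2global = cam2world @ ego2cam = inv(world2cam) @ inv(cam2ego)
# Note: COLMAP's "world" is an arbitrary frame up to scale, so the result
# must still be metrically scaled and aligned to be comparable to nuScenes.

world2cam = np.eye(4); world2cam[:3, 3] = [0.0, 0.0, -5.0]  # placeholder pose
cam2ego = np.eye(4);   cam2ego[:3, 3] = [1.5, 0.0, 1.6]     # placeholder calib

ego2global = inv_se3(world2cam) @ inv_se3(cam2ego)
```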