Vista

[NeurIPS 2024] A Generalizable World Model for Autonomous Driving

Results 35 Vista issues

Hi, thank you for your excellent work! In Vista, is the diffusion loss constructed by comparing the denoised results against the original latents rather than between the predicted noise and...
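For readers comparing the two loss formulations this question contrasts, here is a minimal sketch. It assumes a noising rule of the form `x_t = x0 + sigma * eps`; that rule and all function names are illustrative assumptions, not taken from the Vista code. Under that assumption, a loss on the denoised latents equals the noise-prediction loss scaled by `sigma**2`, so the two differ only in how noise levels are weighted.

```python
import numpy as np

def noise_latents(x0, eps, sigma):
    # Assumed noising rule (hypothetical, not Vista's code): x_t = x0 + sigma * eps
    return x0 + sigma * eps

def eps_loss(eps_pred, eps):
    # MSE between predicted noise and the true noise
    return float(np.mean((eps_pred - eps) ** 2))

def x0_loss(eps_pred, x_t, x0, sigma):
    # Turn the noise prediction into a denoised estimate, then compare to x0
    denoised = x_t - sigma * eps_pred
    return float(np.mean((denoised - x0) ** 2))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8))                     # stand-in for clean latents
eps = rng.standard_normal((4, 8))                    # sampled noise
eps_pred = eps + 0.1 * rng.standard_normal((4, 8))   # stand-in for a model output
sigma = 2.0

x_t = noise_latents(x0, eps, sigma)
loss_eps = eps_loss(eps_pred, eps)
loss_x0 = x0_loss(eps_pred, x_t, x0, sigma)

# Under the assumed noising rule the two losses differ by a factor of sigma**2
print(np.isclose(loss_x0, sigma**2 * loss_eps))
```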

I followed the settings in `docs/ISSUES.md` and `docs/SAMPLING.md`, but I still run out of memory. Here are my config and instruction. In `configs/inference/vista.yaml`, change `en_and_decode_n_samples_a_time` to `1`:

```yaml
model:
  target: vwm.models.diffusion.DiffusionEngine
  params:
    input_key: img_seq
    scale_factor: 0.18215...
```

enhancement

Apologies for all of the questions. Instead of repeating and concatenating the **initial** frame to each latent, did you attempt to instead concatenate the **final** frame of the dynamic priors?...
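The two options in this question can be sketched as follows. This is a toy illustration with made-up shapes, not Vista's implementation: `concat_condition`, `priors`, and `latents` are hypothetical names, and the only thing that changes between the two variants is which prior frame is repeated over time and concatenated along the channel axis.

```python
import numpy as np

def concat_condition(latents, cond_frame):
    """Repeat one conditioning latent over time and concatenate it channel-wise."""
    num_frames = latents.shape[0]
    cond = np.repeat(cond_frame[None], num_frames, axis=0)  # (T, C, H, W)
    return np.concatenate([latents, cond], axis=1)          # (T, 2C, H, W)

rng = np.random.default_rng(0)
priors = rng.standard_normal((3, 4, 8, 8))    # hypothetical dynamic-prior latents
latents = rng.standard_normal((25, 4, 8, 8))  # hypothetical noisy latents

with_initial = concat_condition(latents, priors[0])   # the approach the question describes
with_final = concat_condition(latents, priors[-1])    # the alternative being asked about

print(with_initial.shape, with_final.shape)
```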

I have some questions about the dataset processing. In the NUS dataset, each image frame should have its own corresponding speed, trajectory, angle, yaw, position, etc....

"The coefficients λ1 and λ2 in Eq. (6) are set to 1.0 and 0.1 respectively." from section C.3 in the paper. How did you settle on these coefficient values?

Have you considered integrating it with CARLA for closed-loop testing?

Hello, thanks for sharing your code! I am now trying to train [stage 2](https://github.com/OpenDriveLab/Vista/blob/main/docs/TRAINING.md#stage-2-high-resolution-training) with the provided [vista.safetensors](https://huggingface.co/OpenDriveLab/Vista/blob/main/vista.safetensors), so I changed the command to the following:

```
torchrun \
    --nnodes=1...
```

I am now trying to train [stage 2](https://github.com/OpenDriveLab/Vista/blob/main/docs/TRAINING.md#stage-2-high-resolution-training) with the provided [vista.safetensors](https://huggingface.co/OpenDriveLab/Vista/blob/main/vista.safetensors). After training, I merged the partitioned checkpoints into `pytorch_model.bin` using `zero_to_fp32.py`, and when I use `bin_to_st.py` to...

bug

Thank you very much for your exciting work. Do you have any plans to release the evaluation code corresponding to Table 2?

Thank you for the amazing work! I was trying to train the Vista model with the OpenDV-YouTube dataset (this is also great work, thanks!) and found that OOM sometimes happens...