Vista issues

diffusion loss

1

Hi, thank you for your excellent work! In Vista, is the diffusion loss constructed by comparing the denoised results against the original latents rather than between the predicted noise and...

Gaoeee
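
For context on the question above, the two formulations differ only in the regression target. The following is a minimal sketch, assuming a simple forward process `x_t = x_0 + sigma * eps`; this is an illustrative assumption, not a confirmed description of Vista's training code. All function names are hypothetical. Under that assumption, the noise-prediction loss is the denoised-latent loss rescaled by `1 / sigma**2`.

```python
import random


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def x0_loss(x0_pred, x0):
    # Loss on the denoised result versus the original latent.
    return mse(x0_pred, x0)


def eps_loss(x0_pred, x_t, eps, sigma):
    # Loss on the implied noise prediction versus the true noise.
    # Assumes x_t = x0 + sigma * eps, so eps_pred = (x_t - x0_pred) / sigma.
    eps_pred = [(xt - xp) / sigma for xt, xp in zip(x_t, x0_pred)]
    return mse(eps_pred, eps)


random.seed(0)
sigma = 2.0
x0 = [random.gauss(0, 1) for _ in range(8)]       # toy "latent"
eps = [random.gauss(0, 1) for _ in range(8)]      # sampled noise
x_t = [a + sigma * e for a, e in zip(x0, eps)]    # noised latent
# Stand-in for a denoiser output: the true latent plus a small error.
x0_pred = [a + 0.1 * random.gauss(0, 1) for a in x0]

loss_x0 = x0_loss(x0_pred, x0)
loss_eps = eps_loss(x0_pred, x_t, eps, sigma)
```

In this toy setup the two objectives share the same minimizer and differ only in how each noise level is weighted.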

CUDA out of memory on a 40G GPU

16

I followed the settings in `docs/ISSUES.md` and `docs/SAMPLING.md`, but I still run out of memory. Here are my config and command. In `configs/inference/vista.yaml`, I changed `en_and_decode_n_samples_a_time` to `1`:

```yaml
model:
  target: vwm.models.diffusion.DiffusionEngine
  params:
    input_key: img_seq
    scale_factor: 0.18215
    ...
```

SunYue98

enhancement
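
A setting such as `en_and_decode_n_samples_a_time` typically bounds how many frames pass through the autoencoder at once, trading speed for lower peak memory. The sketch below illustrates that general idea only; `decode_in_chunks` and `fake_decode` are hypothetical stand-ins and not Vista's actual implementation.

```python
def decode_in_chunks(latents, decode_fn, n_samples_a_time):
    """Decode `latents` in slices of at most `n_samples_a_time` items.

    Only one slice is handed to `decode_fn` at a time, so the decoder's
    working set scales with the chunk size, not the full sequence length.
    """
    if n_samples_a_time < 1:
        raise ValueError("n_samples_a_time must be >= 1")
    outputs = []
    for start in range(0, len(latents), n_samples_a_time):
        chunk = latents[start:start + n_samples_a_time]
        outputs.extend(decode_fn(chunk))
    return outputs


chunk_sizes = []  # records how many items each decoder call received


def fake_decode(chunk):
    # Hypothetical decoder: doubles each value and logs the chunk size.
    chunk_sizes.append(len(chunk))
    return [2 * z for z in chunk]


frames = decode_in_chunks(list(range(5)), fake_decode, n_samples_a_time=1)
```

With a chunk size of `1`, the decoder never sees more than one item per call. If memory is still exhausted at that setting, the bottleneck presumably lies elsewhere, for example in the denoising network itself.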

Channel-wise latent prior stronger than dynamic latent priors?

1

Apologies for all of the questions. Instead of repeating and concatenating the **initial** frame to each latent, did you attempt to instead concatenate the **final** frame of the dynamic priors?...

jmonas

some questions about the data processing

1

I have some questions about the dataset processing. In the NUS dataset, each image frame should correspond to its own speed, trajectory, angle, yaw, position, etc....

unicoco7

coefficients λ1 and λ2 values

1

"The coefficients λ1 and λ2 in Eq. (6) are set to 1.0 and 0.1 respectively." from section C.3 in the paper. How did you settle on these coefficient values?

jmonas
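
To make the role of the two coefficients concrete, here is a small sketch. It assumes that Eq. (6) is a weighted sum of a base loss and two auxiliary terms; the issue text does not spell this out, so the structure, the function name, and the argument names are all hypothetical. Only the default values `1.0` and `0.1` come from the quoted sentence.

```python
def total_loss(base_loss, aux_loss_1, aux_loss_2, lambda_1=1.0, lambda_2=0.1):
    """Combine a base loss with two auxiliary terms (assumed form of Eq. (6))."""
    return base_loss + lambda_1 * aux_loss_1 + lambda_2 * aux_loss_2


# Toy values: the second auxiliary term contributes a tenth of its raw value.
example = total_loss(base_loss=0.50, aux_loss_1=0.20, aux_loss_2=0.30)
```

Under this assumed form, a smaller `lambda_2` down-weights the second auxiliary term, which is one common way to keep a term with a larger raw magnitude from dominating the gradient.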

Have you considered putting it in CARLA for closed-loop testing?

2

Have you considered putting it in CARLA for closed-loop testing?

hunkyu

How to load the pretrained safetensors and continue training?

8

Hello, thanks for sharing your code! I am now trying to train [stage 2](https://github.com/OpenDriveLab/Vista/blob/main/docs/TRAINING.md#stage-2-high-resolution-training) with the provided [vista.safetensors](https://huggingface.co/OpenDriveLab/Vista/blob/main/vista.safetensors), so I changed the command to the following:

```
torchrun \
    --nnodes=1...
```

JunyuanDeng

How to convert bin to safetensors when I reload the original safetensors

10

I am now trying to train [stage 2](https://github.com/OpenDriveLab/Vista/blob/main/docs/TRAINING.md#stage-2-high-resolution-training) with the provided [vista.safetensors](https://huggingface.co/OpenDriveLab/Vista/blob/main/vista.safetensors). After training, I merged the partitioned checkpoints into `pytorch_model.bin` using `zero_to_fp32.py`, and when I use `bin_to_st.py` to...

JunyuanDeng

bug
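
For readers following the conversion step, the sketch below shows one generic way to turn a merged `.bin` state dict into a `.safetensors` file. It is not the repository's `bin_to_st.py`, whose contents are not shown here; `make_saveable` and `bin_to_safetensors` are hypothetical names, and the sketch assumes the `.bin` file holds a flat mapping from parameter names to tensors. The only library calls used are `torch.load` and `safetensors.torch.save_file`.

```python
def make_saveable(state_dict):
    """Give every tensor its own contiguous storage.

    `safetensors` rejects tensors that share memory or are non-contiguous,
    so each entry is detached, cloned, and made contiguous first.
    """
    return {
        name: tensor.detach().clone().contiguous()
        for name, tensor in state_dict.items()
    }


def bin_to_safetensors(bin_path, st_path):
    """Load a merged PyTorch checkpoint and write it as safetensors."""
    # Imported lazily so this module can be inspected without torch installed.
    import torch
    from safetensors.torch import save_file

    # Assumption: the file is a flat {parameter_name: tensor} mapping.
    state_dict = torch.load(bin_path, map_location="cpu")
    save_file(make_saveable(state_dict), st_path)
```

A call would look like `bin_to_safetensors("pytorch_model.bin", "model.safetensors")`, where the output filename is only an example. If the converted keys do not match what the training config expects, comparing the key sets of the original and converted files is a reasonable first check.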

Any plan to release evaluation code?

17

Thank you very much for your exciting work. Do you have any plan to release evaluation code corresponding to Table 2?

ILOFI

Any plan to support fp16 / bf16?

3

Thank you for the amazing work! I was trying to train the Vista model on the OpenDV-YouTube dataset (which is also great work, thanks!) and found that OOM sometimes happens...

koukyo1994
