
How to determine num_frames given the time length?

Open Edwardmark opened this issue 1 year ago • 2 comments

I want to use the num_frames config option to generate a video. How do I calculate num_frames given a time length t and an fps? The logic in the code seems pretty complex. Could you please give me a simple equation? Thanks.

Edwardmark avatar Jun 24 '24 02:06 Edwardmark

Specifically, when I set num_frames to 68, the t in stdit_block is 20, and I cannot understand why. Why is self.micro_frame_size set to 17? And in vae_temporal.py (line 371), why is input[0] = 17 used, and why is padding applied?


def get_latent_size(self, input_size):
    latent_size = []
    for i in range(3):
        if input_size[i] is None:
            lsize = None
        elif i == 0:
            # Temporal axis: pad up to a multiple of the downsample factor first.
            time_padding = (
                0
                if (input_size[i] % self.time_downsample_factor == 0)
                else self.time_downsample_factor - input_size[i] % self.time_downsample_factor
            )
            lsize = (input_size[i] + time_padding) // self.patch_size[i]
        else:
            # Spatial axes: plain integer division by the patch size.
            lsize = input_size[i] // self.patch_size[i]
        latent_size.append(lsize)
    return latent_size
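To make the temporal branch above easier to follow, here is a minimal standalone sketch of just that padding arithmetic. The function name temporal_latent_size is hypothetical, and the default values time_downsample_factor=4 and temporal_patch=4 are assumptions chosen for illustration; they are not confirmed by the excerpt.

```python
def temporal_latent_size(num_frames, time_downsample_factor=4, temporal_patch=4):
    """Mirror the i == 0 branch of get_latent_size as a free function.

    Both default values are assumptions for illustration only.
    """
    remainder = num_frames % time_downsample_factor
    # Pad the frame count up to the next multiple of the downsample factor.
    time_padding = 0 if remainder == 0 else time_downsample_factor - remainder
    return (num_frames + time_padding) // temporal_patch


if __name__ == "__main__":
    for frames in (16, 17, 18):
        print(frames, "->", temporal_latent_size(frames))
```

Under those assumed values, 17 input frames are padded by 3 to 20 and then divided by 4, which gives 5.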

Could you please explain a bit? Thanks. @sarroutbi @vjandrea @mahone3297 @duguyixiaono1

Edwardmark avatar Jun 24 '24 02:06 Edwardmark

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Jul 02 '24 01:07 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Jul 09 '24 01:07 github-actions[bot]

Did you find out why?

henbucuoshanghai avatar Jul 18 '24 09:07 henbucuoshanghai

> Did you find out why?

No, setting it to 17 still seems strange to me.

Edwardmark avatar Aug 28 '24 02:08 Edwardmark

@sarroutbi @vjandrea @mahone3297 @duguyixiaono1 could you explain a bit? Thanks.

Edwardmark avatar Aug 28 '24 02:08 Edwardmark

@zhengzangw could you give me a hint as to why the VAE micro_frame_size is set to 17? When I change it to 16, the generated video looks really strange, with sudden changes.

Edwardmark avatar Aug 28 '24 02:08 Edwardmark

You changed it to 16? What about a larger value such as 18, or None?

henbucuoshanghai avatar Aug 28 '24 03:08 henbucuoshanghai

> You changed it to 16? What about a larger value such as 18, or None?

When I change it to 16, the generated video changes abruptly and has low consistency.

Edwardmark avatar Aug 29 '24 01:08 Edwardmark

> @zhengzangw could you give me a hint as to why the VAE micro_frame_size is set to 17? When I change it to 16, the generated video looks really strange, with sudden changes.

The reason is that we use a causal VAE. It compresses 17 frames into 5 latent frames: 17 = 1 + 4 + 4 + 4 + 4 and 5 = 1 + 4. You should not use 16 or 18.
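As a rough illustration of this arithmetic (a sketch, not code from the repository): if each chunk of 17 frames becomes 5 latent frames, then num_frames = 68 is 4 chunks and 4 * 5 = 20 latent frames, which matches the t = 20 seen in stdit_block earlier in the thread. The helper names below are hypothetical, and rounding a duration up to a whole number of 17-frame chunks is only one possible convention, assumed here for simplicity.

```python
MICRO_FRAME_SIZE = 17   # frames per VAE chunk, from the thread
LATENTS_PER_CHUNK = 5   # 17 = 1 + 4*4 frames -> 5 = 1 + 4 latent frames


def latent_frames(num_frames):
    """Latent frame count when num_frames is a whole number of chunks."""
    chunks, remainder = divmod(num_frames, MICRO_FRAME_SIZE)
    if remainder:
        raise ValueError("this sketch only handles multiples of 17 frames")
    return chunks * LATENTS_PER_CHUNK


def num_frames_for_duration(seconds, fps):
    """Assumed convention: round seconds * fps up to a multiple of 17."""
    raw_frames = round(seconds * fps)
    chunks = -(-raw_frames // MICRO_FRAME_SIZE)  # integer ceiling division
    return max(chunks, 1) * MICRO_FRAME_SIZE


if __name__ == "__main__":
    print(latent_frames(68))
    print(num_frames_for_duration(2, 24))
```

With these assumptions, a 2-second clip at 24 fps (48 raw frames) rounds up to 3 chunks, so num_frames would be 51.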

zhengzangw avatar Aug 29 '24 03:08 zhengzangw

@zhengzangw Thanks for your kind reply. I finally got it.

Edwardmark avatar Aug 29 '24 06:08 Edwardmark