Wang Duomin issues

Results: 7 issues of Wang Duomin

Why can't gradients be propagated backward to the input proposal coordinates in the RoI pooling layer?

As asked in the title: in the paper, the authors say "gradients can be propagated backward from the output features to the input features, but not to the input proposal...
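One common explanation, sketched here under the assumption that the layer in question is a quantized RoI max pooling (as in Fast R-CNN), is that rounding the box coordinates to integer feature indices makes the output piecewise constant in those coordinates. The 1-D toy below is hypothetical (the names `roi_max_pool_1d` and `finite_diff` are not from any paper or library). It compares finite-difference slopes with respect to a box coordinate and with respect to a feature value.

```python
import math

def roi_max_pool_1d(features, x1, x2, bins=2):
    """Max-pool the span [x1, x2] of a 1-D feature list into `bins` values.

    Assumption: coordinates are rounded to integer indices before pooling.
    """
    start = int(round(x1))
    end = max(int(round(x2)), start + 1)  # keep at least one cell
    width = (end - start) / bins
    pooled = []
    for b in range(bins):
        lo = start + int(math.floor(b * width))
        hi = start + int(math.ceil((b + 1) * width))
        pooled.append(max(features[lo:hi]))
    return pooled

def finite_diff(f, x, eps=1e-3):
    """Forward finite difference of a list-valued function at x."""
    return [(a - b) / eps for a, b in zip(f(x + eps), f(x))]

features = [0.1, 0.5, 0.2, 0.9, 0.4, 0.3, 0.8, 0.6]

# Slope with respect to the left box coordinate x1.
grad_wrt_x1 = finite_diff(lambda x: roi_max_pool_1d(features, x, 5.3), 1.2)

# Slope with respect to one input feature (index 3, the max of the second bin).
def pool_with_feature3(value):
    modified = list(features)
    modified[3] = value
    return roi_max_pool_1d(modified, 1.2, 5.3)

grad_wrt_feature3 = finite_diff(pool_with_feature3, features[3])

print("d(out)/d(x1):", grad_wrt_x1)
print("d(out)/d(features[3]):", grad_wrt_feature3)
```

In this toy, the slope with respect to `x1` is zero in every bin, while the slope with respect to the selected feature is about 1. Whether this matches the paper's reasoning depends on which pooling variant it describes.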

About codebook

So in the whole architecture, no explicit codebook vectors are used, right? Only categorical logits are fed to the decoder when you train your dVAE?

About uv_weight_mask

Hi, thanks for your awesome work! I am confused about how uv_weight_mask is generated. Did you generate it using uv_kpt_ind.txt? And is uv_kpt_ind.txt generated from Model_keypoints.mat? How can I...

Why are some end seconds negative?

As the picture shows, some end seconds are negative. Is something wrong? Besides, I found that some annotations have the same start_sec and end_sec, which may also be a mistake.
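A minimal sketch of how such entries could be listed, assuming the annotations are available as records with `start_sec` and `end_sec` fields. The sample records, the `id` field, and the function name `find_suspicious` are hypothetical and not taken from the dataset.

```python
# Hypothetical sample records; only start_sec and end_sec are names from the issue.
annotations = [
    {"id": "a", "start_sec": 1.0, "end_sec": 4.5},
    {"id": "b", "start_sec": 2.0, "end_sec": -1.0},  # negative end time
    {"id": "c", "start_sec": 3.0, "end_sec": 3.0},   # zero-length segment
]

def find_suspicious(records):
    """Return ids with a negative end time and ids with start_sec == end_sec."""
    negative_end = [r["id"] for r in records if r["end_sec"] < 0]
    zero_length = [r["id"] for r in records if r["start_sec"] == r["end_sec"]]
    return negative_end, zero_length

negative_end, zero_length = find_suspicious(annotations)
print("negative end_sec:", negative_end)
print("start_sec == end_sec:", zero_length)
```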

Codebook loss

Is beta weighting the wrong loss term? It is applied to the embedding loss, but in the vanilla VQ-VAE it should weight the commitment loss.
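For reference, and assuming the question refers to the original VQ-VAE objective, that loss is commonly written as below. Here $\mathcal{L}_{\text{rec}}$ stands for the reconstruction term, $z_e(x)$ is the encoder output, $e$ is the selected codebook vector, and $\mathrm{sg}[\cdot]$ is the stop-gradient operator.

```latex
\mathcal{L}
  = \mathcal{L}_{\text{rec}}
  + \underbrace{\bigl\| \mathrm{sg}[z_e(x)] - e \bigr\|_2^2}_{\text{embedding (codebook) loss}}
  + \beta \, \underbrace{\bigl\| z_e(x) - \mathrm{sg}[e] \bigr\|_2^2}_{\text{commitment loss}}
```

In this form, $\beta$ scales only the commitment term, whose gradient reaches the encoder, while the embedding term updates the codebook.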

About rasterization for T_uv

Hi, this is good work! Can you explain the rasterization process for T_uv in more detail?

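![20240723-231619](https://github.com/user-attachments/assets/6eae253f-3251-41d2-b67a-66f8879e9a99)

As shown in the figure, when the 3D VAE is used directly to reconstruct a Sora example, the result turns out to be composed of 64*64 patches: a reconstructed 512*512 video contains 8*8 patches, and a 1024*1024 video contains 16*16 patches. I searched the whole code and could not find where these patches are constructed. A 64*64 patch should correspond to processing the latents in groups of 8*8, but the code contains no such operation.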
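The arithmetic behind these numbers can be checked with a short sketch. It assumes only the figures stated above (64-pixel patches and 8 latent cells per patch side). The constant and function names are hypothetical, and the 8x spatial downsampling factor is inferred from those two figures rather than read from the code.

```python
PATCH_PX = 64          # patch side length observed in the reconstructions
LATENTS_PER_PATCH = 8  # latent cells per patch side, as stated in the issue

# Implied spatial downsampling factor of the VAE (an inference, not from the code).
spatial_downsample = PATCH_PX // LATENTS_PER_PATCH

def patch_grid(resolution_px, patch_px=PATCH_PX):
    """Number of patches along one side of a square frame."""
    return resolution_px // patch_px

for res in (512, 1024):
    patches = patch_grid(res)
    latent_side = res // spatial_downsample
    print(f"{res}x{res}: {patches}x{patches} patches, "
          f"{latent_side}x{latent_side} latent grid")
```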