
How to get the consistency measurement?

kigane opened this issue 3 years ago • 5 comments

E(O_i, O_j) = LPIPS(O_i, M_{i,j}, W_{i,j}(O_j)): how do we get the mask M_{i,j}, and how is it applied in LPIPS?

kigane • Oct 18 '22 07:10

E(O_i, O_j) = LPIPS(O_i, M_{i,j}, W_{i,j}(O_j)): how do we get the mask M_{i,j}, and how is it applied in LPIPS?

We followed exactly the approach in https://github.com/phoenix104104/fast_blind_video_consistency to calculate LPIPS; please find the details there. Thanks.
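
For reference, the plain LPIPS call in such a pipeline boils down to something like the sketch below with the standalone `lpips` package (the tensors here are random placeholders; the linked repo wires the same kind of perceptual model through its own evaluation scripts):

```python
# Minimal sketch: perceptual distance between two frames with the
# standalone `lpips` package. Placeholder tensors stand in for the
# processed frame P and the model output O.
import torch
import lpips

loss_fn = lpips.LPIPS(net='alex')  # AlexNet backbone; 'vgg'/'squeeze' also work

# Frames must be NCHW float tensors scaled to [-1, 1]
P = torch.rand(1, 3, 256, 256) * 2 - 1
O = torch.rand(1, 3, 256, 256) * 2 - 1

with torch.no_grad():
    d = loss_fn(P, O)  # perceptual distance, shape (1, 1, 1, 1)
print(d.item())
```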

semchan • Oct 19 '22 02:10

I have read their code. In evaluate_LPIPS.py they use LPIPS to measure the perceptual distance between a processed image P and their model output O, but P and O are the same frame of the video. In evaluate_WarpError.py, they use optical flow predicted by FlowNet2 between frame1 and frame2 to warp frame2 to frame1, then compute the L2 distance over the non-occluded pixels. They do not use any mask in the LPIPS metric. As far as I know, LPIPS uses a VGG/SqueezeNet/AlexNet backbone to extract feature maps from several layers of the two input images and then computes the L2 distance between those features. So I am really confused about the mask M_{i,j} used in the equation. Could you please explain this detail more clearly? Thank you.
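
For concreteness, the warp-error part I am describing looks roughly like this (the flow and occlusion mask are assumed to be precomputed, e.g. by FlowNet2; the names here are mine, not theirs):

```python
# Rough sketch of the warp-error computation described above.
# `flow` is the optical flow from frame1 to frame2 (1,2,H,W) and
# `occ` marks occluded pixels (1,1,H,W), both assumed precomputed.
import torch
import torch.nn.functional as F

def warp(frame2, flow):
    """Warp frame2 (1,3,H,W) into frame1's view using the flow field."""
    _, _, h, w = frame2.shape
    gy, gx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    x = gx.float() + flow[0, 0]          # displaced x coordinates
    y = gy.float() + flow[0, 1]          # displaced y coordinates
    # grid_sample expects (x, y) coordinates normalized to [-1, 1]
    grid = torch.stack((2 * x / (w - 1) - 1, 2 * y / (h - 1) - 1), dim=-1)
    return F.grid_sample(frame2, grid.unsqueeze(0), align_corners=True)

def warp_error(frame1, frame2, flow, occ):
    """Mean squared error on non-occluded pixels only."""
    warped = warp(frame2, flow)
    mask = 1.0 - occ                     # 1 = non-occluded
    diff = (frame1 - warped) ** 2 * mask
    return diff.sum() / (3 * mask.sum()).clamp(min=1.0)
```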

kigane • Oct 20 '22 02:10

@kigane Have you solved this problem? It's strange that none of StylizedNeRF, StyleRF, Learning to Stylize Novel Views, etc. provide a method for calculating consistency.

ZijiangY1116 • May 10 '23 09:05

@kigane Have you solved this problem? It's strange that none of StylizedNeRF, StyleRF, Learning to Stylize Novel Views, etc. provide a method for calculating consistency.

I have the same doubt. Why hasn't the calculation method for the quantitative metric been provided, even though it's the only evaluation criterion?

zAuk000 • Jun 07 '23 14:06

I have read their code. In evaluate_LPIPS.py they use LPIPS between a processed image P and their model output O, which are the same frame. In evaluate_WarpError.py they warp frame2 to frame1 with FlowNet2 optical flow and compute the L2 distance over non-occluded pixels. […] So I am really confused about the mask M_{i,j} used in the equation.

Have you tried testing the generated results using the code from "warperror.py"? If so, are the results close to those in the paper?
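
In the meantime, my best guess at E(O_i, O_j) = LPIPS(O_i, M_{i,j}, W_{i,j}(O_j)) is to warp O_j to view i, zero out the occluded pixels in both images with M_{i,j}, and only then run LPIPS. This is only an interpretation, not something the authors have confirmed:

```python
# One *guess* at E(O_i, O_j): mask both images with the non-occlusion
# mask, then run plain LPIPS on the masked pair. This interpretation
# is not confirmed by the authors. Reuses warp() from the sketch above.
def masked_lpips(o_i, o_j, flow, mask, loss_fn):
    """o_i, o_j: (1,3,H,W) tensors in [-1,1]; mask: (1,1,H,W), 1 = valid."""
    warped_j = warp(o_j, flow)                  # W_{i,j}(O_j)
    return loss_fn(o_i * mask, warped_j * mask)
```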

zAuk000 • Jun 07 '23 14:06