Why 112x112?

Open yaseryacoob opened this issue 1 year ago • 5 comments

In reading the paper and the repo, I am wondering why you chose 112x112 as the face resolution, given that other embeddings work at 200+ resolution and that one could go to 512x512 to capture identity even better.

Thanks for sharing your code.

yaseryacoob avatar Mar 17 '24 19:03 yaseryacoob

Thank you for your interest in our work. This is because the face recognition (identity encoding) part of our code is based on insightface, which uses 112x112 images. In practice, 112x112 images work well for recognizing (identifying) faces.
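To make the 112x112 input concrete, here is a minimal sketch of shrinking a larger face crop to that size before identity encoding. It is not code from this repository or from insightface: `resize_nearest`, `normalize`, the synthetic `crop`, and the 127.5 mean/std values are all assumptions chosen for illustration, and a real pipeline would normally also align the face and use a better interpolation method.

```python
TARGET_SIZE = 112  # side length discussed in this thread


def resize_nearest(image, size=TARGET_SIZE):
    """Resize a 2D grid (list of rows) to size x size by nearest-neighbour sampling."""
    src_h = len(image)
    src_w = len(image[0])
    out = []
    for y in range(size):
        # Pick the source row that corresponds to output row y.
        row = image[min(src_h - 1, y * src_h // size)]
        out.append([row[min(src_w - 1, x * src_w // size)] for x in range(size)])
    return out


def normalize(image, mean=127.5, std=127.5):
    """Map 8-bit pixel values to the range [-1, 1]; mean and std are assumed values."""
    return [[(p - mean) / std for p in row] for row in image]


# Hypothetical 512x512 grayscale crop filled with a horizontal gradient.
crop = [[(x * 255) // 511 for x in range(512)] for _ in range(512)]

small = normalize(resize_nearest(crop))
pixel_ratio = (512 * 512) / (TARGET_SIZE * TARGET_SIZE)

print(len(small), len(small[0]))
print(pixel_ratio)
```

The ratio printed at the end shows how much is discarded: a 512x512 crop has roughly 21 times as many pixels as a 112x112 one.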

mapooon avatar Mar 17 '24 23:03 mapooon

The issue is that face-recognition-type embeddings are (or may be) good for face recognition, but more is needed for generative purposes. This is why you had to add the expression encoding. I wonder if a richer embedding can be computed. It is an open question, beyond your work. Thanks for sharing your work!

yaseryacoob avatar Mar 17 '24 23:03 yaseryacoob

Exactly. 112x112 images are considered sufficient for face recognition, but it is not obvious that they are sufficient for generative purposes. Higher resolution may yield better results for generative models, as you mentioned.

mapooon avatar Mar 17 '24 23:03 mapooon

Thank you for your excellent work. Could you please share a README for your training process, along with the training time and datasets?

gaoyixuan111 avatar Mar 19 '24 10:03 gaoyixuan111

Thank you for your excellent work. Could you please share a README for your training process, along with the training time and datasets?

May I ask whether you managed to get the related code?

Andyplus1 avatar Apr 07 '24 06:04 Andyplus1