graphics CvxNet: Can an autoencoder reconstruct an image exactly?

Hi! Thank you for your work!

In the paper you show examples where 2D b/w images are used as the autoencoder input (Figure 3, page 4), and the output is the object reconstructed exactly by half-planes (lines in 2D space).
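For reference, the half-plane representation mentioned above can be sketched in a few lines. This is a minimal NumPy illustration, assuming the usual construction of a convex shape as an intersection of half-planes with a smooth maximum (LogSumExp) and a sigmoid for the soft indicator. The function names and the values of `delta` and `sigma` are assumptions chosen for illustration; this is not the repository's code.

```python
import numpy as np

def rectangle_halfplanes(x0, y0, x1, y1):
    """Four half-planes n.x + d <= 0 whose intersection is [x0, x1] x [y0, y1]."""
    normals = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    offsets = np.array([-x1, x0, -y1, y0])
    return normals, offsets

def halfplane_distances(points, normals, offsets):
    # points: (N, 2), normals: (H, 2), offsets: (H,) -> (N, H) signed distances.
    # A negative value means the point is inside that half-plane.
    return points @ normals.T + offsets

def hard_indicator(points, normals, offsets):
    # A point is inside the convex shape iff it is inside every half-plane.
    return halfplane_distances(points, normals, offsets).max(axis=1) <= 0.0

def soft_indicator(points, normals, offsets, delta=20.0, sigma=20.0):
    # Smooth maximum via a numerically stable LogSumExp, then a sigmoid.
    # delta and sigma are illustrative sharpness values (assumptions).
    h = delta * halfplane_distances(points, normals, offsets)
    m = h.max(axis=1, keepdims=True)
    phi = (m[:, 0] + np.log(np.exp(h - m).sum(axis=1))) / delta
    return 1.0 / (1.0 + np.exp(sigma * phi))

normals, offsets = rectangle_halfplanes(0.2, 0.3, 0.7, 0.8)
points = np.array([[0.5, 0.5], [0.1, 0.5]])  # first inside, second outside
inside_hard = hard_indicator(points, normals, offsets)
inside_soft = soft_indicator(points, normals, offsets)
```

With the hard maximum, four half-planes describe an axis-aligned rectangle exactly; the soft version only approximates it near the boundary, which is one reason "exact" reconstruction depends on how sharp the smoothing is.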

Tell me, please, whether such experiments were carried out? Does the network actually restore the shape exactly?

I am asking because I decided to run similar experiments for my own purposes (2D b/w images of axis-aligned rectangles, one rectangle per image), but I can't get the network to train. On a single image the network trains well (and works well on that example), but when I train it on multiple images, the predictions (even on the training samples) in most cases restore the rectangle shapes very poorly. It seems like it should work.

Some parameters I use:

n_half_planes = 15  # with a margin, although the rectangle has only 4 sides
n_parts = 1  # one rectangle per image
dims = 2  # b/w images
lr = 1e-4
latent_size = 10

Also, I use a lighter backbone.
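To make the setup easier to reproduce, here is a minimal sketch of the kind of dataset described above: binary images with one axis-aligned rectangle each. The image size, the minimum side length, and the function names are assumptions for illustration and are not taken from the original experiment.

```python
import numpy as np

def make_rectangle_image(rng, size=64, min_side=8):
    """Return a (size, size) binary image with one axis-aligned rectangle,
    plus its box as (x0, y0, x1, y1) in pixel coordinates (x1, y1 exclusive)."""
    w = int(rng.integers(min_side, size // 2 + 1))
    h = int(rng.integers(min_side, size // 2 + 1))
    x0 = int(rng.integers(0, size - w + 1))
    y0 = int(rng.integers(0, size - h + 1))
    img = np.zeros((size, size), dtype=np.float32)
    img[y0:y0 + h, x0:x0 + w] = 1.0
    return img, (x0, y0, x0 + w, y0 + h)

def make_dataset(n, size=64, seed=0):
    # A fixed seed keeps the dataset reproducible between training runs.
    rng = np.random.default_rng(seed)
    samples = [make_rectangle_image(rng, size) for _ in range(n)]
    images = np.stack([img for img, _ in samples])
    boxes = np.array([box for _, box in samples])
    return images, boxes

images, boxes = make_dataset(16)
```

Keeping the ground-truth boxes alongside the images makes it possible to compare the predicted half-planes against the four true sides, rather than judging the reconstruction only visually.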

Jan 14 '21 14:01 mashaeidlina

Tell me, please, whether such experiments were carried out? Does the network actually restore the shape exactly?

Yes, the teardrop image can be restored in overfitting mode.

Mar 05 '21 03:03 taiya

Yes, the teardrop image can be restored in overfitting mode.

OK, so you trained the model on a few (or even one) images in overfitting mode and visualized the results on the training sample, didn't you? If I understand everything correctly, then the reasoning about what exactly the autoencoder does internally seems unconvincing to me.

Mar 05 '21 11:03 mashaeidlina

Yes, this is just an illustration image (we didn't have a dataset of fonts at hand). It's a one-off, and we fiddled with the initialization until overfitting the model to it gave something visually pleasing.

unconvincing? if you say so.

Mar 05 '21 19:03 taiya