
How does the loss minimize if autoencoder_B keeps changing the weights learned by autoencoder_A ?

Open · mike3454 opened this issue 6 years ago · 2 comments

The training of the single encoder and the two decoders happens as in the following simplified code.

self.encoder = self.Encoder()
self.decoder_A = self.Decoder()
self.decoder_B = self.Decoder()
...
# both models are built around the same encoder instance, so its weights are shared
self.autoencoder_A = KerasModel(x, self.decoder_A(self.encoder(x)))
self.autoencoder_B = KerasModel(x, self.decoder_B(self.encoder(x)))

for epoch in range(epochs):
    self.autoencoder_A.train_on_batch(...)
    self.autoencoder_B.train_on_batch(...)  # doesn't this reset the encoder weights?

My understanding is that training autoencoder_A does not change the weights of autoencoder_B's decoder, but it does change the weights of the encoder, since the encoder is shared. Please correct me if I am wrong.

How does the loss get minimized if the two autoencoders alternately change the weights of the shared encoder?
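
To make the question concrete, here is a minimal standalone sketch (plain tf.keras Dense layers with made-up sizes, not the real faceswap model) that checks this weight-sharing behaviour after a single training step on the A side:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# toy stand-ins for the real Encoder()/Decoder() networks
encoder = keras.Sequential([layers.Dense(8, activation="relu")])
decoder_A = keras.Sequential([layers.Dense(16, activation="sigmoid")])
decoder_B = keras.Sequential([layers.Dense(16, activation="sigmoid")])

x = keras.Input(shape=(16,))
autoencoder_A = keras.Model(x, decoder_A(encoder(x)))  # both models reuse
autoencoder_B = keras.Model(x, decoder_B(encoder(x)))  # the same `encoder`
autoencoder_A.compile(optimizer="adam", loss="mse")
autoencoder_B.compile(optimizer="adam", loss="mse")

enc_before = [w.copy() for w in encoder.get_weights()]
decB_before = [w.copy() for w in decoder_B.get_weights()]

batch = np.random.rand(4, 16).astype("float32")
autoencoder_A.train_on_batch(batch, batch)  # one step on the A side only

enc_moved = any(not np.allclose(a, b) for a, b in zip(enc_before, encoder.get_weights()))
decB_moved = any(not np.allclose(a, b) for a, b in zip(decB_before, decoder_B.get_weights()))
print("encoder changed:", enc_moved)      # expected: True  (shared weights moved)
print("decoder_B changed:", decB_moved)   # expected: False (the other decoder is untouched)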

mike3454 · Feb 20 '19 16:02

Over time, the gradients will descend toward a shared minimum. This works because the encoder is meant to generalize across faces (though in practice it only generalizes to the two faces it is trained on).
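
As a rough numerical illustration (a toy quadratic problem, nothing like the real network), alternating single gradient steps on two losses that share one parameter still drives both losses to zero, because each task-specific "decoder" parameter absorbs its own residual instead of fighting over the shared one forever:

# shared "encoder" weight s, plus one "decoder" weight per task
s, d_A, d_B = 0.0, 0.0, 0.0
target_A, target_B = 3.0, -1.0
lr = 0.1

for step in range(200):
    # one step on task A: loss_A = (s + d_A - target_A) ** 2
    err_A = s + d_A - target_A
    s, d_A = s - lr * 2 * err_A, d_A - lr * 2 * err_A
    # one step on task B: loss_B = (s + d_B - target_B) ** 2
    err_B = s + d_B - target_B
    s, d_B = s - lr * 2 * err_B, d_B - lr * 2 * err_B

print((s + d_A - target_A) ** 2, (s + d_B - target_B) ** 2)
# expected: both losses end up near zero despite the alternating updates

The real models are far more complicated, but the same mechanism is what lets the shared encoder weights settle instead of being reset back and forth.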

bryanlyon · Feb 20 '19 17:02

Not convincing. The model looks like this:

input face A -> encoder -> base vector A -> decoder A -> output face which resembles A (autoencoder_A)
input face B -> encoder -> base vector B -> decoder B -> output face which resembles B (autoencoder_B)

Now, while training autoencoder_A, it does not affect decoder B but changes the weights of decoder A and the shared encoder, and the same happens for autoencoder_B. In that case it is like the two autoencoders alternately changing the weights of the encoder, thereby never converging. Right?

mike3454 · Feb 20 '19 17:02