
For a GAN, why does my D loss increase and my G loss decrease to 0 at the beginning?

Open hefeiwangyande opened this issue 8 years ago • 10 comments

The generated pictures are just noise. At step 4650: G_loss_adv: 0.325, G_accuracy: 0.984,
D_loss_adv: 0.982, d_loss_pos: 0.598, d_loss_neg: 1.366,
D_accuracy: 0.258, d_pos_acc: 0.500, d_neg_acc: 0.016. My G_loss is lower than my D_loss, and the generated samples score significantly higher than the real pictures, so D behaves completely abnormally (normally D_loss should be small, since D can distinguish real from fake, right?). My D is four conv layers plus a fully connected layer. I don't know where I went wrong.

hefeiwangyande avatar Jan 12 '18 13:01 hefeiwangyande

Please fix your message, it is not readable.

And I doubt anyone will spend hours trying to debug your code; please come with a precise question.

DEKHTIARJonathan avatar Jan 12 '18 13:01 DEKHTIARJonathan

    G_loss_adv = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            logits=d_fake_logit, labels=tf.ones_like(d_fake_logit)), name='g_loss')

    d_loss_pos = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            logits=d_real_logit, labels=tf.ones_like(d_real_logit)), name='d_loss_real')
    d_loss_neg = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            logits=d_fake_logit, labels=tf.zeros_like(d_fake_logit)), name='d_loss_fake')
    D_loss_adv = tf.add(.5 * d_loss_pos, .5 * d_loss_neg, name='d_loss')

    # accuracy metrics
    d_pos_acc = tf.reduce_mean(tf.cast(score_real > 0.5, tf.float32), name='accuracy_real')
    d_neg_acc = tf.reduce_mean(tf.cast(score_fake < 0.5, tf.float32), name='accuracy_fake')
    d_accuracy = tf.add(.5 * d_pos_acc, .5 * d_neg_acc, name='accuracy')

    g_accuracy = tf.reduce_mean(tf.cast(score_fake > 0.5, tf.float32), name='accuracy')

hefeiwangyande avatar Jan 13 '18 01:01 hefeiwangyande

In your implementation it looks like d_loss_fake should be different from g_loss_adv.

Assuming that: 1) G is the generator, outputting a fake image from a noise vector z, and 2) D is the discriminator, outputting the probability that the input is real:

one gets: g_loss_adv = D(G(z)) and d_loss_fake = 1 - D(G(z))
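To make the two losses concrete, here is a small numpy sketch (not the poster's TF graph, just the underlying math) of what `tf.nn.sigmoid_cross_entropy_with_logits` computes for the same fake logit under the two label choices: with labels of 1 it equals `-log D(G(z))` (the generator's loss), and with labels of 0 it equals `-log(1 - D(G(z)))` (the discriminator's fake-side loss):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_with_logits(logit, label):
    # Numerically stable form used by tf.nn.sigmoid_cross_entropy_with_logits:
    # max(x, 0) - x * z + log(1 + exp(-|x|))
    return np.maximum(logit, 0) - logit * label + np.log1p(np.exp(-abs(logit)))

d_fake_logit = 2.0         # hypothetical discriminator logit for a fake sample
p = sigmoid(d_fake_logit)  # D(G(z)): probability the fake is judged "real"

g_loss = bce_with_logits(d_fake_logit, 1.0)       # generator: wants D(G(z)) -> 1
d_loss_fake = bce_with_logits(d_fake_logit, 0.0)  # discriminator: wants D(G(z)) -> 0

assert np.isclose(g_loss, -np.log(p))             # -log D(G(z))
assert np.isclose(d_loss_fake, -np.log(1 - p))    # -log(1 - D(G(z)))
```

So even though both losses are built from the same logit, the different labels make them pull D(G(z)) in opposite directions.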

rafaelvalle avatar Jan 13 '18 01:01 rafaelvalle

@DEKHTIARJonathan Thanks for your suggestion, I have changed the information.

hefeiwangyande avatar Jan 13 '18 02:01 hefeiwangyande

@rafaelvalle Your suggestion is:

    d_loss_neg = tf.reduce_mean(1 - tf.nn.sigmoid_cross_entropy_with_logits(
            logits=d_fake_logit, labels=tf.zeros_like(d_fake_logit)), name='d_loss_fake')

Actually, d_fake_logit = D(G(z)) in my implementation. For generated samples its value should be relatively small (close to 0), so I think d_loss_neg is not wrong, or is my understanding wrong?

hefeiwangyande avatar Jan 13 '18 02:01 hefeiwangyande

Let's look at the positive (real) and negative (adversarial) losses one by one. Assume D outputs the probability of the input being real.

A) If d_loss_pos is minimized using D(x) and the labels for x are 1, D minimizes its loss by pushing D(x) toward 1. B) If d_loss_neg is minimized using D(G(z)) and the labels for G(z) are 0, D minimizes its loss by pushing D(G(z)) toward 0.

Your problem could be that 1) the labels for x and G(z) are the same, instead of 1 and 0 respectively. 2) If that's not the problem, it could be that using D(G(z)) suffers from vanishing gradients early on, which is why people prefer to use 1 - D(G(z)).

Now let's assume you have 1 and 2 correct; where else could the problem be? Note that in your code below the generator and the discriminator minimize the same function. This is not correct, as they should minimize different loss functions. That's why I suggested changing g_loss_adv to 1 - d_fake_logit.

    g_loss_adv = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            logits=d_fake_logit, labels=tf.ones_like(d_fake_logit)), name='g_loss')
    d_loss_neg = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            logits=d_fake_logit, labels=tf.zeros_like(d_fake_logit)), name='d_loss_fake')
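As a numeric aside on point 2, the vanishing-gradient effect can be checked directly. In the standard framing, the minimax generator loss log(1 - D(G(z))) saturates when D confidently rejects fakes, while the non-saturating form -log D(G(z)) keeps a strong gradient. A plain numpy sketch, with a hypothetical early-training logit:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = -8.0  # logit early in training: D(G(z)) = sigmoid(-8) is roughly 0.0003

# d/dx of the saturating generator loss   log(1 - sigmoid(x))
grad_saturating = -sigmoid(x)
# d/dx of the non-saturating loss        -log(sigmoid(x))
grad_non_saturating = sigmoid(x) - 1.0

# The saturating loss has almost no gradient when D confidently rejects fakes,
# while the non-saturating loss still pushes hard in the right direction.
assert abs(grad_saturating) < 1e-3
assert abs(grad_non_saturating) > 0.99
```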

rafaelvalle avatar Jan 15 '18 16:01 rafaelvalle

@rafaelvalle I think I partly understand your meaning: you mean that my D(G(z)) may be too hard to reduce because of vanishing gradients, so I should choose to maximize 1 - D(G(z)) instead?

In addition, the parameters of d_loss_neg and g_loss_adv are not exactly the same:

    g_loss_adv = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            logits=d_fake_logit, labels=tf.ones_like(d_fake_logit)), name='g_loss')
    d_loss_neg = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            logits=d_fake_logit, labels=tf.zeros_like(d_fake_logit)), name='d_loss_fake')

hefeiwangyande avatar Jan 16 '18 02:01 hefeiwangyande

Oh, I missed the ones_like and zeros_like! Sorry for not reading carefully. There are many things that could be the reason:

  1. Loss function: try using 1 - D(G(z)) instead.
  2. Time: wait for a few iterations until training converges to a specific behavior, for example the generator always winning.
  3. Learning rates: try adjusting them so that the losing part has a higher learning rate.
  4. Discriminator vs. generator number of iterations: try adjusting them so that the losing part gets more iterations.
  5. Weight initialization: try Xavier uniform with the gain set according to the non-linearity.
  6. Noise vector: try using uniform noise instead of normal noise.
  7. Model capacity: try increasing the losing part's capacity.
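For item 5, a minimal sketch of Xavier/Glorot uniform initialization with a gain, in plain numpy rather than a TF initializer (the fan sizes here are made up for illustration):

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, gain=1.0, seed=0):
    # Glorot/Xavier uniform: sample from U(-a, a)
    # with a = gain * sqrt(6 / (fan_in + fan_out))
    a = gain * np.sqrt(6.0 / (fan_in + fan_out))
    rng = np.random.default_rng(seed)
    return rng.uniform(-a, a, size=(fan_in, fan_out))

# The gain is chosen per non-linearity, e.g. sqrt(2) for ReLU, 1.0 for tanh/linear
w = xavier_uniform(512, 256, gain=np.sqrt(2.0))
assert abs(w).max() <= np.sqrt(2.0) * np.sqrt(6.0 / (512 + 256))
```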

Report back here if you find what the problem was, so that we all learn.

rafaelvalle avatar Jan 16 '18 16:01 rafaelvalle

You can also use label smoothing for the discriminator to weaken it.

I have submitted a PR on TensorFlow to help implement this feature: https://github.com/tensorflow/tensorflow/pull/16153

You can take inspiration from it to write your own custom code.
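A minimal sketch of one-sided label smoothing, assuming the loss setup from earlier in the thread: the real-side labels become 0.9 instead of 1.0 (e.g. `labels=0.9 * tf.ones_like(d_real_logit)`), which penalizes an over-confident D. In plain numpy, using the same formula `tf.nn.sigmoid_cross_entropy_with_logits` implements:

```python
import numpy as np

def bce_with_logits(logit, label):
    # Stable form of sigmoid cross-entropy: max(x,0) - x*z + log(1 + exp(-|x|))
    return np.maximum(logit, 0) - logit * label + np.log1p(np.exp(-abs(logit)))

d_real_logit = 5.0  # D is very confident a real sample is real

hard = bce_with_logits(d_real_logit, 1.0)    # hard labels = 1.0
smooth = bce_with_logits(d_real_logit, 0.9)  # one-sided smoothing: labels = 0.9

# Smoothed labels assign a nonzero loss to over-confident D outputs,
# so D is discouraged from saturating on the real samples
assert smooth > hard
```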

DEKHTIARJonathan avatar Jan 17 '18 08:01 DEKHTIARJonathan

I encountered the same issue, and finally found that I had forgotten to give each optimizer ONLY the gradients of its own part, discriminator or generator. If you don't pass something like var_list=generator_vars, the generator's optimizer will also update the discriminator's parameters and weaken it:

discriminator_vars = [var for var in tf.global_variables() if "discriminator" in var.name]
generator_vars = [var for var in tf.global_variables() if "generator" in var.name]

self.D_optimizer = tf.train.AdamOptimizer(learning_rate=2e-4).minimize(self.D_loss, var_list=discriminator_vars)
self.G_optimizer = tf.train.AdamOptimizer(learning_rate=2e-4).minimize(self.G_loss, var_list=generator_vars)
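The name-based filtering above can be sanity-checked without building a graph. The variable names below are hypothetical stand-ins for what `tf.variable_scope("generator")` / `tf.variable_scope("discriminator")` would produce:

```python
# Hypothetical variable names, standing in for tf.global_variables() entries
all_vars = ["discriminator/conv1/kernel", "discriminator/dense/bias",
            "generator/deconv1/kernel", "generator/dense/bias"]

discriminator_vars = [v for v in all_vars if "discriminator" in v]
generator_vars = [v for v in all_vars if "generator" in v]

# Each optimizer should only ever touch its own side's parameters:
# the two sets must be disjoint and together cover everything
assert set(discriminator_vars).isdisjoint(generator_vars)
assert len(discriminator_vars) + len(generator_vars) == len(all_vars)
```

The key point is that every trainable variable must land in exactly one of the two lists, which is only guaranteed if the networks are actually built under those variable scopes.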

danielkaifeng avatar Sep 12 '19 06:09 danielkaifeng