In cs20si, the autoencoder implementation does not converge.
I just tried the deconv version of the autoencoder implementation in the cs20si folder and found that it does not converge. I upgraded the source code to support TensorFlow 1.0 before running the experiment. Below is the log from the terminal:
Dataset size: 55000
Num iters: 27500
Loss at step 0 : 257.92
Loss at step 1000 : 76.21
Loss at step 2000 : 76.8853
Loss at step 3000 : 72.7826
Loss at step 4000 : 78.2417
Loss at step 5000 : 73.5233
Loss at step 6000 : 74.2761
Loss at step 7000 : 73.3959
Loss at step 8000 : 73.17
Loss at step 9000 : 71.3167
Loss at step 10000 : 71.9886
Loss at step 11000 : 72.9754
Loss at step 12000 : 70.0283
Loss at step 13000 : 73.1442
Loss at step 14000 : 73.8517
Loss at step 15000 : 76.1486
Loss at step 16000 : 73.104
Loss at step 17000 : 70.041
Loss at step 18000 : 74.5712
Loss at step 19000 : 75.6608
Loss at step 20000 : 76.3306
Loss at step 21000 : 74.3838
Loss at step 22000 : 72.5587
Loss at step 23000 : 78.1425
Loss at step 24000 : 70.3776
Loss at step 25000 : 75.8254
Loss at step 26000 : 74.9392
What is the problem here? Has anyone else run into the same issue?
Yes, I have the same problem. How can we fix it?
I solved the problem by modifying the decoder function as below and setting the learning rate to 0.001.

def decoder(input):
    fc_dec = fc(input, 'fc_dec', 72)
    fc_dec_reshaped = tf.reshape(fc_dec, [-1, 3, 3, 8])
    deconv1 = deconv(fc_dec_reshaped, 'deconv1', [3, 3, 8], [2, 2])
    deconv2 = deconv(deconv1, 'deconv2', [3, 3, 8], [2, 2])
    deconv3 = deconv(deconv2, 'deconv3', [3, 3, 8], [2, 2])
    deconv4 = deconv(deconv3, 'deconv4', [5, 5, 1], [1, 1], padding='VALID', non_linear_fn=tf.sigmoid)
    return deconv4
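For completeness, the learning-rate change goes wherever the training script builds its optimizer. A minimal sketch, assuming the script uses Adam and a reconstruction loss tensor named loss (the actual optimizer and variable names in the repo may differ):

import tensorflow as tf  # TF 1.x API

# ... build the autoencoder graph and obtain the reconstruction `loss` tensor ...
learning_rate = 0.001  # lowered value suggested above
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)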
@guome I've tried your solution and it works like a charm. But something is weird: the original model sometimes converges. I think the main reason may be that there are too many nodes in the FC layer.
The original autoencoder seems too shallow (too few channels). It works after making it deeper, cf. the Udacity deep learning course:
def encoder(my_input):
    # Conv network with 3 conv layers and 1 FC layer
    # Conv 1: filter: [3, 3, 32], stride: [2, 2], relu
    conv1 = conv(my_input, 'conv1', [3, 3, 32], [2, 2], padding='SAME',
                 non_linear_fn=tf.nn.relu)
    # Conv 2: filter: [3, 3, 32], stride: [2, 2], relu
    conv2 = conv(conv1, 'conv2', [3, 3, 32], [2, 2], padding='SAME',
                 non_linear_fn=tf.nn.relu)
    # Conv 3: filter: [3, 3, 16], stride: [2, 2], relu
    conv3 = conv(conv2, 'conv3', [3, 3, 16], [2, 2], padding='SAME',
                 non_linear_fn=tf.nn.relu)
    # FC: output_dim: 128, no non-linearity
    fc_encoder = fc(conv3, 'fc_encoder', 128, non_linear_fn=None)
    return fc_encoder
def decoder(my_input):
    # Deconv network with 1 FC layer and 3 deconv layers
    # FC: output dim: 144 (= 3 * 3 * 16), relu
    fc_decoder = fc(my_input, 'fc_decoder', 144, non_linear_fn=tf.nn.relu)
    # Reshape to [batch_size, 3, 3, 16]
    reshape = tf.reshape(fc_decoder, [-1, 3, 3, 16])
    # Deconv 1: filter: [3, 3, 32], stride: [2, 2], padding: VALID, relu (3x3 -> 7x7)
    deconv1 = deconv(reshape, 'deconv1', [3, 3, 32], [2, 2], padding='VALID',
                     non_linear_fn=tf.nn.relu)
    # Deconv 2: filter: [3, 3, 32], stride: [2, 2], padding: SAME, relu (7x7 -> 14x14)
    deconv2 = deconv(deconv1, 'deconv2', [3, 3, 32], [2, 2], padding='SAME',
                     non_linear_fn=tf.nn.relu)
    # Deconv 3: filter: [3, 3, 1], stride: [2, 2], padding: SAME, sigmoid (14x14 -> 28x28)
    deconv3 = deconv(deconv2, 'deconv3', [3, 3, 1], [2, 2], padding='SAME',
                     non_linear_fn=tf.nn.sigmoid)
    return deconv3
Extracting ./data/MNIST_data\train-images-idx3-ubyte.gz
Extracting ./data/MNIST_data\train-labels-idx1-ubyte.gz
Extracting ./data/MNIST_data\t10k-images-idx3-ubyte.gz
Extracting ./data/MNIST_data\t10k-labels-idx1-ubyte.gz
Dataset size: 55000
Num iters: 10742
Loss at step 0 : 181.357
Loss at step 1000 : 81.3057
Loss at step 2000 : 20.6
Loss at step 3000 : 11.7914
Loss at step 4000 : 9.47117
Loss at step 5000 : 8.79411
Loss at step 6000 : 7.61038
Loss at step 7000 : 6.79378
Loss at step 8000 : 6.07108
Loss at step 9000 : 5.87751
Loss at step 10000 : 5.48986
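For anyone trying to reproduce this, here is a minimal sketch of how the encoder and decoder above might be wired into a training loop. The placeholder name, loss function, batch size, and optimizer below are my assumptions for illustration, not necessarily what the cs20si script uses; it still relies on the repo's conv/fc/deconv helpers.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# MNIST images as 28x28x1 tensors in [0, 1]
images = tf.placeholder(tf.float32, [None, 28, 28, 1], name='images')
code = encoder(images)            # [batch, 128] latent code
reconstruction = decoder(code)    # [batch, 28, 28, 1], sigmoid output in (0, 1)
# Assumed loss: per-example summed squared error, averaged over the batch
loss = tf.reduce_mean(tf.reduce_sum(tf.square(reconstruction - images), axis=[1, 2, 3]))
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

mnist = input_data.read_data_sets('./data/MNIST_data')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(10000):
        batch, _ = mnist.train.next_batch(128)
        batch = batch.reshape(-1, 28, 28, 1)
        _, l = sess.run([train_op, loss], feed_dict={images: batch})
        if step % 1000 == 0:
            print('Loss at step', step, ':', l)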