So I set out to train a GAN on Kaggle's cats-vs-dogs dataset to generate cat/dog photos. Despite all my hyperparameter tuning, I still get nothing but noise in the output. Here is what I did:
1) I resized the images to 64 x 64.
2) Converted them to protocol buffers (TFRecord files).
3) Pushed them into the graph as shown below:
    def input_pipe(filenames, batch_size=32):
        # Stream serialized examples straight from the TFRecord files
        dataset = tf.data.TFRecordDataset(filenames=filenames)
        # `parse` decodes one serialized example into an image tensor
        dataset = dataset.map(parse, num_parallel_calls=64)
        dataset = dataset.shuffle(25000)
        dataset = dataset.batch(batch_size)
        dataset = dataset.prefetch(16)
        return dataset
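For reference, a minimal sketch of what the `parse` function mapped above might look like; the 'image' feature key, raw-bytes encoding, and scaling are my assumptions, not taken from my actual pipeline (the scaling to [-1, 1] matters because the generator ends in tanh):

    def parse(serialized):
        # Assumption: each record stores raw image bytes under an 'image' key
        features = tf.parse_single_example(
            serialized,
            features={'image': tf.FixedLenFeature([], tf.string)})
        # Decode the raw bytes back into a 64x64x3 uint8 tensor
        image = tf.decode_raw(features['image'], tf.uint8)
        image = tf.reshape(image, [64, 64, 3])
        # Scale to [-1, 1] so real images match the generator's tanh range
        image = tf.cast(image, tf.float32) / 127.5 - 1.0
        return image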
4) Declared the generator function like this:
    def generator(z, out_channel_dim, keep_prob, is_train=True, alpha=0.2):
        with tf.variable_scope('generator'):
            # Project the latent vector and reshape to a 7x7x128 feature map
            fc = tf.layers.dense(z, 7 * 7 * 128, use_bias=False)
            fc = tf.reshape(fc, (-1, 7, 7, 128))
            bn0 = tf.layers.batch_normalization(fc)
            lrelu0 = tf.maximum(alpha * bn0, bn0)  # leaky ReLU
            # 7x7 -> 8x8
            conv1 = tf.layers.conv2d_transpose(lrelu0, 512, 2, 1, 'valid', use_bias=False)
            bn1 = tf.layers.batch_normalization(conv1)
            lrelu1 = tf.maximum(alpha * bn1, bn1)
            # 8x8 -> 16x16
            conv2 = tf.layers.conv2d_transpose(lrelu1, 256, 3, 2, 'same', use_bias=False)
            bn2 = tf.layers.batch_normalization(conv2)
            lrelu2 = tf.maximum(alpha * bn2, bn2)
            # 16x16 -> 32x32
            conv3 = tf.layers.conv2d_transpose(lrelu2, 128, 3, 2, 'same')
            bn3 = tf.layers.batch_normalization(conv3)
            lrelu3 = tf.maximum(alpha * bn3, bn3)
            # 32x32 -> 64x64; use out_channel_dim instead of a hard-coded 3
            logits = tf.layers.conv2d_transpose(lrelu3, out_channel_dim, 3, 2, 'same')
            out = tf.tanh(logits, name='image')
            return out
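A quick shape sanity check (just a sketch; the 100-dimensional latent vector is an assumption) confirms the transposed convolutions upsample 7x7 -> 8x8 -> 16x16 -> 32x32 -> 64x64:

    # Hypothetical latent placeholder; only the channel count comes from above
    z = tf.placeholder(tf.float32, (None, 100), name='input_z')
    sample = generator(z, out_channel_dim=3, keep_prob=0.5)
    print(sample.shape)  # expected: (?, 64, 64, 3)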
5) Then the discriminator:
    def discriminator(images, reuse=False, alpha=0.2):
        with tf.variable_scope('discriminator', reuse=reuse):
            # 64x64x3 -> 32x32x128
            conv1 = tf.layers.conv2d(images, 128, 4, 2, padding='same')
            lrelu1 = tf.maximum(alpha * conv1, conv1)  # leaky ReLU
            # 32x32 -> 30x30 (valid padding, stride 1)
            conv2 = tf.layers.conv2d(lrelu1, 256, 3, 1, 'valid')
            bn2 = tf.layers.batch_normalization(conv2)
            lrelu2 = tf.maximum(alpha * bn2, bn2)
            # 30x30 -> 29x29
            conv3 = tf.layers.conv2d(lrelu2, 256, 2, 1, 'valid')
            bn3 = tf.layers.batch_normalization(conv3)
            lrelu3 = tf.maximum(alpha * bn3, bn3)
            # Flatten and map to a single real/fake logit
            flat = tf.layers.flatten(lrelu3)
            logits = tf.layers.dense(flat, 1)
            out = tf.sigmoid(logits)
            return out, logits
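One API detail worth flagging (an observation, not a change to the code above): `tf.layers.batch_normalization` defaults to `training=False`, so every batch-norm call in both networks normalizes with the moving statistics (initialized to mean 0, variance 1) rather than the current batch's statistics. Threading the flag through would look like this sketch:

    # Sketch: `is_train` would be passed into both networks (the generator
    # already accepts it in its signature but never uses it)
    bn = tf.layers.batch_normalization(conv, training=is_train)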
6) The loss function:
    def model_loss(input_real, input_z, out_channel_dim, keep_prob, alpha=0.2, smooth_factor=0.1):
        with tf.name_scope("loss"):
            # Discriminator loss on real images, with one-sided label smoothing
            d_model_real, d_logits_real = discriminator(input_real, alpha=alpha)
            d_loss_real = tf.reduce_mean(
                tf.nn.sigmoid_cross_entropy_with_logits(
                    logits=d_logits_real,
                    labels=tf.ones_like(d_model_real) * (1 - smooth_factor)))
            # Discriminator loss on generated images
            input_fake = generator(input_z, out_channel_dim, keep_prob, alpha=alpha)
            d_model_fake, d_logits_fake = discriminator(input_fake, reuse=True, alpha=alpha)
            d_loss_fake = tf.reduce_mean(
                tf.nn.sigmoid_cross_entropy_with_logits(
                    logits=d_logits_fake, labels=tf.zeros_like(d_model_fake)))
            # Generator loss: fool the discriminator into predicting "real"
            g_loss = tf.reduce_mean(
                tf.nn.sigmoid_cross_entropy_with_logits(
                    logits=d_logits_fake, labels=tf.ones_like(d_model_fake)),
                name='g_loss')
            final_d_loss = tf.add(d_loss_real, d_loss_fake, name='d_loss')
            tf.summary.scalar("d_loss", final_d_loss)
            tf.summary.scalar("g_loss", g_loss)
            merged = tf.summary.merge_all()
            return final_d_loss, g_loss, input_fake, merged
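For completeness, here is a minimal sketch of the optimizer step these losses feed into; the Adam settings are assumptions (typical DCGAN values), and the UPDATE_OPS control dependency is what keeps the batch-norm moving averages updated during training:

    def model_opt(d_loss, g_loss, learning_rate=0.0002, beta1=0.5):
        # Split variables by the scopes declared in the two networks
        t_vars = tf.trainable_variables()
        d_vars = [v for v in t_vars if v.name.startswith('discriminator')]
        g_vars = [v for v in t_vars if v.name.startswith('generator')]
        # Batch-norm moving-average updates live in UPDATE_OPS and must run
        # alongside the train steps
        with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
            d_train_opt = tf.train.AdamOptimizer(learning_rate, beta1=beta1).minimize(
                d_loss, var_list=d_vars)
            g_train_opt = tf.train.AdamOptimizer(learning_rate, beta1=beta1).minimize(
                g_loss, var_list=g_vars)
        return d_train_opt, g_train_opt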
The losses after training look like this: [Discriminator Loss plot] [Generator Loss plot]
I have already tried batch norm on input_real_image, added dropout, and changed the learning rates of both the generator and the discriminator, and I still get nothing but noise: [GAN Output image]
Note: in case you suspect vanishing gradients, the discriminator's loss never goes to zero; it hovers around 0.5.