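I have been trying to write a variational autoencoder (VAE) in TensorFlow. I was able to implement the version from [https://arxiv.org/abs/1312.6114] with a Gaussian encoder network and a Bernoulli decoder.
However, I want to work with real-valued data, and I have not been able to get a VAE with a Gaussian decoder to work. I have narrowed the problem down: my network does not seem to learn the parameters of a diagonal multivariate Gaussian. Below is the code for a very simple test case. My input data is simply drawn from a normal distribution N(0, 1), and all the network needs to learn is the mean and variance of the data. I expect the mean to converge to 0 and the variance to converge to 1, but they do not: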
import tensorflow as tf
import numpy as np
tf.reset_default_graph()
input_dim = 1
hidden_dim = 10
learning_rate = 0.001
num_batches = 1000
# Network
x = tf.placeholder(tf.float32, (None, input_dim))
with tf.variable_scope('Decoder'):
    h1 = tf.layers.dense(x, hidden_dim, activation=tf.nn.softplus, name='h1')
    mu = tf.layers.dense(h1, input_dim, activation=tf.nn.softplus, name='mu')
    diag_stdev = tf.layers.dense(h1, input_dim, activation=tf.nn.softplus, name='diag_stdev')
# Loss: -log(p(x))
with tf.variable_scope('Loss'):
    dist = tf.contrib.distributions.MultivariateNormalDiag(loc=mu, scale_diag=diag_stdev)
    loss = - tf.reduce_mean(tf.log(1e-10 + dist.prob(x)))
# Optimizer
train_step = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
summary_writer = tf.summary.FileWriter('./log_dir', tf.get_default_graph())
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    mu_plot = np.zeros(num_batches,)
    for i in range(num_batches):  # degenerate case batch_size of 1
        input_ = np.random.multivariate_normal(mean=[0], cov=np.diag([1]), size=(1))
        loss_, mu_, diag_stdev_, _ = sess.run([loss, mu, diag_stdev, train_step], feed_dict={x: input_})
        print("-p(x): {}, mu: {}, diag_stdev: {}".format(loss_, mu_, diag_stdev_))
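
For reference, here is a minimal sketch in plain Python (no TensorFlow) of the values I would expect a constant mean and standard deviation to converge to on this data. It computes the Gaussian negative log-likelihood in closed form and compares the maximum-likelihood estimates with the true parameters (0, 1). It also shows how the `log(1e-10 + prob)` form used in the loss above flattens out for points far from the mean. The helper names `gaussian_log_prob`, `gaussian_nll`, and `floored_log_prob` are my own for illustration and are not part of the code above, and this is only a sanity check, not an explanation of the behaviour.

```python
import math
import random
import statistics


def gaussian_log_prob(x, mu, sigma):
    """Log-density of a scalar x under N(mu, sigma^2)."""
    return -(0.5 * math.log(2.0 * math.pi) + math.log(sigma)
             + 0.5 * ((x - mu) / sigma) ** 2)


def gaussian_nll(xs, mu, sigma):
    """Mean negative log-likelihood of the samples xs under N(mu, sigma^2)."""
    return -sum(gaussian_log_prob(x, mu, sigma) for x in xs) / len(xs)


def floored_log_prob(x, mu, sigma, eps=1e-10):
    """log(eps + p(x)), mirroring the form used in the loss above."""
    return math.log(eps + math.exp(gaussian_log_prob(x, mu, sigma)))


random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(10000)]

# Maximum-likelihood estimates for a constant mean and standard deviation.
mu_hat = statistics.fmean(samples)
sigma_hat = statistics.pstdev(samples)

nll_at_mle = gaussian_nll(samples, mu_hat, sigma_hat)
nll_at_truth = gaussian_nll(samples, 0.0, 1.0)

# A point far in the tail: the exact log-density keeps decreasing,
# while log(eps + p) cannot go below log(eps).
exact_tail = gaussian_log_prob(10.0, 0.0, 1.0)
floored_tail = floored_log_prob(10.0, 0.0, 1.0)

print("mu_hat: {:.4f}, sigma_hat: {:.4f}".format(mu_hat, sigma_hat))
print("NLL at MLE: {:.4f}, NLL at (0, 1): {:.4f}".format(nll_at_mle, nll_at_truth))
print("exact log p(10): {:.4f}, log(1e-10 + p(10)): {:.4f}".format(exact_tail, floored_tail))
```

With enough samples the estimates land close to 0 and 1, and the loss at those values should be close to 0.5 * log(2 * pi * e), which is the baseline I am comparing the network's loss against.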