Question

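I am implementing an autoencoder setup in which both the encoder and the decoder are recurrent neural networks (RNNs). The encoder and decoder models are set up as follows: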

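# Assumed imports (TensorFlow 1.x); they were not shown in the original snippet.
import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn

def enc(message, weights, biases):
    # Split the input along axis 1 into a list of 4 per-time-step tensors.
    message = tf.unstack(message, 4, 1)
    fw_cell = rnn.LSTMCell(num_hidden_enc)
    with tf.variable_scope("encoder"):
        outputs, _ = rnn.static_rnn(fw_cell, message, dtype=tf.float32)
    # Project the last LSTM output to the codeword.
    return tf.matmul(outputs[-1], weights) + biases

def dec(codeword, weights, biases, time_steps):
    # Add a feature axis, then split into a list of 7 per-time-step tensors.
    codeword = tf.expand_dims(codeword, axis=2)
    codeword = tf.unstack(codeword, 7, 1)
    fw_cell = rnn.LSTMCell(num_hidden_dec)
    with tf.variable_scope("decoder"):
        outputs, _ = rnn.static_rnn(fw_cell, codeword, dtype=tf.float32)
    a = tf.matmul(outputs[-1], weights) + biases
    # Small random initial weights for the final 4x4 dense layer.
    weight_fc = np.random.normal(loc=0.0, scale=0.01, size=[4, 4])
    init = tf.constant_initializer(weight_fc)
    return tf.layers.dense(a, units=4, activation=tf.nn.sigmoid, kernel_initializer=init)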

I am using a mean squared error loss function and the Adam optimizer.

# message_hat is the output of the decoder neural network, input_bits is the input to the encoder network
loss = tf.reduce_sum(0.5 * (tf.squeeze(input_bits) - message_hat) ** 2) / float(batch_size)
opt = tf.train.AdamOptimizer().minimize(loss)
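
For concreteness, the loss expression above is half the summed squared error divided by the batch size. The following is a minimal pure-Python sketch of that arithmetic, not the TensorFlow graph itself; the function name `half_sse_per_batch` and the sample values are made up for illustration, and both inputs are assumed to have the same shape.

```python
def half_sse_per_batch(targets, predictions):
    """Half the sum of squared errors over all entries, divided by the batch size.

    Mirrors the arithmetic of:
        tf.reduce_sum(0.5 * (targets - predictions) ** 2) / float(batch_size)
    Both arguments are lists of equal-length rows; the batch size is the row count.
    """
    batch_size = len(targets)
    total = 0.0
    for t_row, p_row in zip(targets, predictions):
        for t, p in zip(t_row, p_row):
            total += 0.5 * (t - p) ** 2
    return total / float(batch_size)


# Hypothetical batch of two 4-bit messages and their reconstructions.
targets = [[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0]]
predictions = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
loss = half_sse_per_batch(targets, predictions)
```

When the shapes match, this quantity differs from a per-element mean squared error only by a constant factor, so it rescales the gradients without changing their direction.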

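When I print the decoder network's output at the end of epoch 0, I get ordinary floating-point values, but after running one more epoch, all values become NaN. This happens for both the encoder and the decoder. So I tried gradient clipping with the following code: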

opt = tf.train.AdamOptimizer()
gvs = opt.compute_gradients(loss)
capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
train_op = opt.apply_gradients(capped_gvs)
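
One caveat about element-wise clipping, sketched here in plain Python rather than TensorFlow: clamping bounds large or infinite gradients, but a value that is already NaN typically stays NaN, because NaN compares as neither below nor above any bound. The helper names are hypothetical, and the NaN passthrough is written out explicitly for illustration rather than taken from any library's implementation.

```python
import math


def clip_by_value(g, lo=-1.0, hi=1.0):
    """Clamp a single gradient value to [lo, hi]; NaN is returned unchanged."""
    if math.isnan(g):
        return g  # NaN is neither below lo nor above hi, so there is nothing to clamp to
    return max(lo, min(hi, g))


def clip_gradients(grads, lo=-1.0, hi=1.0):
    """Apply clip_by_value element-wise to a flat list of gradient values."""
    return [clip_by_value(g, lo, hi) for g in grads]


# Hypothetical gradient values: one large, one small, one infinite, one NaN.
raw = [250.0, -0.3, float("-inf"), float("nan")]
clipped = clip_gradients(raw)

# Large and infinite values are now bounded; the NaN entry is still NaN.
has_nan_after_clip = any(math.isnan(g) for g in clipped)
```

If this holds for the real graph, clipping can only help when the gradients are large but still finite; it cannot repair a NaN that has already appeared in the forward pass or the loss.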

The problem persists. More precisely, this is the output starting from epoch 1:

Epoch: 1
Decoding loss: nan
Decoder output: [[nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]
 [nan nan nan nan nan nan nan]]

Is there something inherently wrong with the code? Suggestions are welcome. Thanks in advance.
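
When narrowing down a failure like this, it may help to record the loss at every training step instead of once per epoch and look for the first non-finite value, since that shows whether the loss grows steadily before overflowing or jumps to NaN abruptly. This is a minimal sketch, assuming the per-step losses have been collected into a plain Python list; the function name and the sample values are hypothetical.

```python
import math


def first_non_finite_step(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


# Hypothetical per-step losses: growing, then overflowing, then NaN.
step_losses = [0.52, 0.49, 3.7, 812.4, float("inf"), float("nan")]
bad_step = first_non_finite_step(step_losses)
```

The values just before `bad_step` are the ones worth inspecting, for example by printing the encoder and decoder outputs at that step.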

RNN outputs NaN after weight updates

0 answers: