Question

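I have searched many tutorials, blogs, guides, and the official TensorFlow documentation to understand this. For example, consider the following lines: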

lstm = tf.nn.rnn_cell.LSTMCell(512)
output, state_tuple = lstm(current_input, last_state_tuple)

Now, if I unpack the state,

last_cell_memory, last_hidden_state = state_tuple

output and last_hidden_state have exactly the same shape, [batch_size, 512]. Can the two be used interchangeably? In other words, can I do this:

last_state_tuple = last_cell_memory, output

and then feed last_state_tuple into lstm?

Answer 1

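Jacques's answer is correct, but it leaves out an important point: the state of an LSTM layer is almost always equal to its output. The difference matters when the chain of LSTM cells is long and the input sequences do not all have the same length (and are therefore padded). That is when you should distinguish between the state and the output.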

See the runnable example in my answer on a similar question (it uses BasicRNNCell, but you will get the same result with LSTMCell).
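The padded case can be sketched in plain Python. This is only an illustration, not the linked example: `rnn_step` and `unroll` are hypothetical helpers with arbitrary weights, and `unroll` mimics the behaviour that tf.nn.dynamic_rnn documents for its sequence_length argument (outputs are zeroed and the state is copied through past the valid length).

```python
import math


def rnn_step(x, h_prev, wx=0.5, wh=0.8, b=0.1):
    """Single-unit tanh RNN step; the weights are arbitrary illustration values."""
    h = math.tanh(wx * x + wh * h_prev + b)
    return h, h  # (output, new state): identical for a single step


def unroll(inputs, length):
    """Unroll over a padded sequence; only the first `length` steps are real."""
    state = 0.0
    outputs = []
    for t, x in enumerate(inputs):
        if t < length:
            out, state = rnn_step(x, state)
        else:
            out = 0.0  # padded step: zero output, state left unchanged
        outputs.append(out)
    return outputs, state


padded = [1.0, -0.5, 0.0, 0.0]  # two real steps followed by two padding steps
outputs, final_state = unroll(padded, length=2)

assert outputs[-1] == 0.0          # the last output comes from padding
assert final_state == outputs[1]   # the state is the last valid output
assert final_state != outputs[-1]  # so the two are no longer interchangeable
```

Under these assumptions, reading the last output of a padded sequence gives zeros, while the final state still holds the value from the last real time step.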

Answer 2

Yes, the second element of the state is the same as the output.

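From https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/LSTMStateTuple

Stores two elements: (c, h), in that order. Where c is the hidden state and h is the output.

Note that the documentation's "hidden state" c is what the question calls the cell memory, and h is what the question calls last_hidden_state.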
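To see why the two coincide, here is a minimal sketch of one step of the standard LSTM equations for a single unit. It is an illustration under stated assumptions, not TensorFlow's implementation: `lstm_step`, `sigmoid`, and the random weights are hypothetical, and extras such as the forget bias, peepholes, and projection are left out.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, state, w):
    """One step of a single-unit LSTM with scalar input and state.

    `state` is (c, h), the same order as LSTMStateTuple.
    `w` maps each gate name to (input weight, recurrent weight, bias).
    """
    c_prev, h_prev = state

    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    i = sigmoid(pre("i"))    # input gate
    f = sigmoid(pre("f"))    # forget gate
    o = sigmoid(pre("o"))    # output gate
    g = math.tanh(pre("g"))  # candidate cell value

    c = f * c_prev + i * g
    h = o * math.tanh(c)
    # h is returned twice: once as the output, once inside the state.
    return h, (c, h)


random.seed(0)
weights = {gate: tuple(random.uniform(-1, 1) for _ in range(3)) for gate in "ifog"}

output, (cell_memory, hidden_state) = lstm_step(0.5, (0.1, -0.2), weights)
assert output == hidden_state

# Rebuilding the state as (cell_memory, output) therefore gives the same next step.
next_a = lstm_step(0.3, (cell_memory, hidden_state), weights)
next_b = lstm_step(0.3, (cell_memory, output), weights)
assert next_a == next_b
```

In this sketch the output and h are literally the same value, which is why swapping one for the other in the state tuple changes nothing for a single cell step.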

This can also be verified experimentally:

import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.InteractiveSession)
from numpy import random as rng
lstm = tf.nn.rnn_cell.LSTMCell(10)
inp = tf.placeholder(tf.float32, shape=(1, 10))  # current input
stt = tf.placeholder(tf.float32, shape=(1, 10))  # previous c
hdd = tf.placeholder(tf.float32, shape=(1, 10))  # previous h
out = lstm(inp, (stt, hdd))  # returns (output, (c, h))
sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)
a = rng.randn(1, 10)
b = rng.randn(1, 10)
c = rng.randn(1, 10)
output = sess.run(out, {inp: a, stt: b, hdd: c})
# output[0] is the returned output; output[1][1] is h from the new state tuple
assert (output[0] == output[1][1]).all()

In TensorFlow, what is the difference between the returned 'output' and 'h' in the state tuple (c, h) of LSTMCell?
