Feeding an initial state to an LSTMCell

Time: 2018-04-01 23:29:38

Tags: python tensorflow machine-learning

I am referring to the code at https://github.com/martin-gorner/tensorflow-rnn-shakespeare/blob/master/rnn_train.py and trying to convert the cells from GRUCell to LSTMCell. Here is an excerpt of the code.

# input state
Hin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Hin')  # [ BATCHSIZE, INTERNALSIZE * NLAYERS]

# using NLAYERS=3 layers of GRU cells, unrolled SEQLEN=30 times
# dynamic_rnn infers SEQLEN from the size of the inputs Xo

# How to properly apply dropout in RNNs: see README.md
cells = [rnn.GRUCell(INTERNALSIZE) for _ in range(NLAYERS)]

# "naive dropout" implementation
dropcells = [rnn.DropoutWrapper(cell, input_keep_prob=pkeep) for cell in cells]
multicell = rnn.MultiRNNCell(dropcells, state_is_tuple=False)
multicell = rnn.DropoutWrapper(multicell, output_keep_prob=pkeep)  # dropout for the softmax layer

Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=Hin)
# Yr: [ BATCHSIZE, SEQLEN, INTERNALSIZE ]
# H:  [ BATCHSIZE, INTERNALSIZE*NLAYERS ] # this is the last state in the sequence

H = tf.identity(H, name='H')  # just to give it a name

I know that an LSTMCell has two states, the cell state C and the output state H. What I want to do is feed initial_state a tuple of the two states. How can I do this the right way? I have tried various approaches, but I always run into a TensorFlow error.
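For a single LSTMCell (no MultiRNNCell) the tuple can be fed directly. A minimal sketch of that case, reusing the INTERNALSIZE and Xo defined above; the placeholder names c_in and h_in are illustrative and not part of the original code:

import tensorflow as tf
from tensorflow.contrib import rnn

# one placeholder per LSTM state component
c_in = tf.placeholder(tf.float32, [None, INTERNALSIZE], name='c_in')
h_in = tf.placeholder(tf.float32, [None, INTERNALSIZE], name='h_in')

cell = rnn.LSTMCell(INTERNALSIZE)
# a single cell accepts one LSTMStateTuple(c, h) as its initial state
Yr, H = tf.nn.dynamic_rnn(cell, Xo, dtype=tf.float32,
                          initial_state=rnn.LSTMStateTuple(c_in, h_in))

The multi-layer case is the part I cannot get right.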

Edit: here is one of the attempts:

# inputs
X = tf.placeholder(tf.uint8, [None, None], name='X')  # [ BATCHSIZE, SEQLEN ]
Xo = tf.one_hot(X, ALPHASIZE, 1.0, 0.0)  # [ BATCHSIZE, SEQLEN, ALPHASIZE ]
# expected outputs = same sequence shifted by 1 since we are trying to predict the next character
Y_ = tf.placeholder(tf.uint8, [None, None], name='Y_')  # [ BATCHSIZE, SEQLEN ]
Yo_ = tf.one_hot(Y_, ALPHASIZE, 1.0, 0.0)  # [ BATCHSIZE, SEQLEN, ALPHASIZE ]
# input state
Hin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Hin')  # [ BATCHSIZE, INTERNALSIZE * NLAYERS]
Cin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Cin')
initial_state = tf.nn.rnn_cell.LSTMStateTuple(Cin, Hin)
# using NLAYERS=3 layers of LSTM cells, unrolled SEQLEN=30 times
# dynamic_rnn infers SEQLEN from the size of the inputs Xo

# How to properly apply dropout in RNNs: see README.md
cells = [rnn.LSTMCell(INTERNALSIZE) for _ in range(NLAYERS)]

# "naive dropout" implementation
dropcells = [rnn.DropoutWrapper(cell, input_keep_prob=pkeep) for cell in cells]
multicell = rnn.MultiRNNCell(dropcells, state_is_tuple=True)
multicell = rnn.DropoutWrapper(multicell, output_keep_prob=pkeep)  # dropout for the softmax layer

Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=initial_state)

It says "TypeError: 'Tensor' object is not iterable."

Thanks.

1 answer:

Answer 0: (score: 1)

The error is happening because when you build the graph you have to provide a tuple (of placeholders) for each individual layer, and then feed in the state for each layer when you train.

What the error is saying is: I need to iterate over a list of (c, m) tuples, one for each cell, because you have multiple cells and I need to initialize all of their states, but all I was given is a single Tensor, and I cannot iterate over that.

This snippet shows how to set up the placeholder when building the graph:

import tensorflow as tf
from tensorflow.contrib import rnn

state_size = 10
num_layers = 3

X = tf.placeholder(tf.float32, [None, 100, 10])

# the second dimension is size 2 and represents
# c, m (the cell state and the hidden state)
# leave the batch_size dimension as None
state_placeholder = tf.placeholder(tf.float32, [num_layers, 2,
                                    None, state_size])
# l is a list of num_layers tensors, one per layer,
# each of shape [2, batch_size, state_size]
l = tf.unstack(state_placeholder, axis=0)

# then we create an LSTMStateTuple for each layer
rnn_tuple_state = tuple(
         [rnn.LSTMStateTuple(l[idx][0],l[idx][1])
          for idx in range(num_layers)]
)

# I had to set reuse=True here (tf.__version__ 1.7.0); note that this
# shares a single LSTMCell instance (and its weights) across all layers
cells = [rnn.LSTMCell(state_size, reuse=True)] * num_layers
mc = rnn.MultiRNNCell(cells, state_is_tuple=True)

outputs, state = tf.nn.dynamic_rnn(cell=mc,
                                   inputs=X,
                                   initial_state=rnn_tuple_state,
                                   dtype=tf.float32)
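
At training time you then feed an array of the matching shape through state_placeholder: zeros for the very first batch, and the returned state afterwards. A minimal sketch of that feeding loop; the batch_size value and the batches iterable are assumptions, not part of the original answer:

import numpy as np

batch_size = 32
# zero state for the very first batch: [num_layers, 2, batch_size, state_size]
current_state = np.zeros((num_layers, 2, batch_size, state_size))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for x_batch in batches:  # batches: whatever your input pipeline yields
        out, next_state = sess.run(
            [outputs, state],
            feed_dict={X: x_batch, state_placeholder: current_state})
        # state comes back as a tuple of LSTMStateTuples; numpy coerces it
        # into the [num_layers, 2, batch_size, state_size] array the
        # placeholder expects on the next iteration
        current_state = np.array(next_state)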

Here is the relevant bit from the docs:

initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.

So we end up creating a tuple of placeholders, one per cell (layer), of the required size (batch_size, state_size), where batch_size = None. I elaborated on this answer.
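
Applied back to the code in the question, a sketch of the same pattern using the question's INTERNALSIZE, NLAYERS and pkeep; this replaces the single flat Hin placeholder and is my adaptation, not code from the linked repository:

# one placeholder holding c and h for every layer,
# shape [NLAYERS, 2, BATCHSIZE, INTERNALSIZE], batch dimension left as None
Hin = tf.placeholder(tf.float32, [NLAYERS, 2, None, INTERNALSIZE], name='Hin')
layer_states = tf.unstack(Hin, axis=0)
initial_state = tuple(rnn.LSTMStateTuple(layer_states[idx][0],
                                         layer_states[idx][1])
                      for idx in range(NLAYERS))

cells = [rnn.LSTMCell(INTERNALSIZE) for _ in range(NLAYERS)]
dropcells = [rnn.DropoutWrapper(cell, input_keep_prob=pkeep) for cell in cells]
multicell = rnn.MultiRNNCell(dropcells, state_is_tuple=True)
multicell = rnn.DropoutWrapper(multicell, output_keep_prob=pkeep)

Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32,
                          initial_state=initial_state)
# H is now a tuple of LSTMStateTuples rather than a single tensor, so the
# tf.identity(H, name='H') naming trick no longer applies as-is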