I am referring to the code at https://github.com/martin-gorner/tensorflow-rnn-shakespeare/blob/master/rnn_train.py . I am trying to convert the cells from GRUCell to LSTMCell. Here is an excerpt of the code:
# input state
Hin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Hin') # [ BATCHSIZE, INTERNALSIZE * NLAYERS]
# using a NLAYERS=3 layers of GRU cells, unrolled SEQLEN=30 times
# dynamic_rnn infers SEQLEN from the size of the inputs Xo
# How to properly apply dropout in RNNs: see README.md
cells = [rnn.GRUCell(INTERNALSIZE) for _ in range(NLAYERS)]
# "naive dropout" implementation
dropcells = [rnn.DropoutWrapper(cell, input_keep_prob=pkeep) for cell in cells]
multicell = rnn.MultiRNNCell(dropcells, state_is_tuple=False)
multicell = rnn.DropoutWrapper(multicell, output_keep_prob=pkeep) # dropout for the softmax layer
Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=Hin)
# Yr: [ BATCHSIZE, SEQLEN, INTERNALSIZE ]
# H: [ BATCHSIZE, INTERNALSIZE*NLAYERS ] # this is the last state in the sequence
H = tf.identity(H, name='H') # just to give it a name
I know that an LSTMCell has two states: the cell state C and the output state H. What I want to do is feed initial_state a tuple of the two states. How can I do this the right way? I have tried various approaches but always run into a TensorFlow error.
Edit: here is one of the attempts:
# inputs
X = tf.placeholder(tf.uint8, [None, None], name='X') # [ BATCHSIZE, SEQLEN ]
Xo = tf.one_hot(X, ALPHASIZE, 1.0, 0.0) # [ BATCHSIZE, SEQLEN, ALPHASIZE ]
# expected outputs = same sequence shifted by 1 since we are trying to predict the next character
Y_ = tf.placeholder(tf.uint8, [None, None], name='Y_') # [ BATCHSIZE, SEQLEN ]
Yo_ = tf.one_hot(Y_, ALPHASIZE, 1.0, 0.0) # [ BATCHSIZE, SEQLEN, ALPHASIZE ]
# input state
Hin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Hin') # [ BATCHSIZE, INTERNALSIZE * NLAYERS]
Cin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Cin')
initial_state = tf.nn.rnn_cell.LSTMStateTuple(Cin, Hin)
# using NLAYERS=3 layers of LSTM cells, unrolled SEQLEN=30 times
# dynamic_rnn infers SEQLEN from the size of the inputs Xo
# How to properly apply dropout in RNNs: see README.md
cells = [rnn.LSTMCell(INTERNALSIZE) for _ in range(NLAYERS)]
# "naive dropout" implementation
dropcells = [rnn.DropoutWrapper(cell, input_keep_prob=pkeep) for cell in cells]
multicell = rnn.MultiRNNCell(dropcells, state_is_tuple=True)
multicell = rnn.DropoutWrapper(multicell, output_keep_prob=pkeep) # dropout for the softmax layer
Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=initial_state)
它说" TypeError:' Tensor'对象不可迭代。"
Thanks.
Answer 0 (score: 1)
The error is occurring because when you build the graph you must provide a tuple (of placeholders) for each individual layer, and then feed those states in when you run training.
What the error is essentially saying is: "I need to iterate over a list of tuples (of c's and h's), since you have multiple cells and I need to initialize all of their states, but all I see is a single Tensor, which I cannot iterate over."
This snippet shows how to set up the placeholder when building the graph:
import tensorflow as tf
from tensorflow.contrib import rnn

state_size = 10
num_layers = 3
X = tf.placeholder(tf.float32, [None, 100, 10])
# the second dimension is size 2 and represents
# c, h (the cell state and the hidden state);
# set the batch_size to None
state_placeholder = tf.placeholder(tf.float32,
                                   [num_layers, 2, None, state_size])
# unstack into one [2, batch_size, state_size] tensor per layer
l = tf.unstack(state_placeholder, axis=0)
# then we create a tuple of LSTMStateTuples, one per layer
rnn_tuple_state = tuple(
    rnn.LSTMStateTuple(l[idx][0], l[idx][1])
    for idx in range(num_layers)
)
# create a fresh cell per layer; reusing a single instance
# ([rnn.LSTMCell(state_size, reuse=True)] * num_layers) would share
# weights across layers and required reuse=True on tf 1.7.0
cells = [rnn.LSTMCell(state_size) for _ in range(num_layers)]
mc = rnn.MultiRNNCell(cells, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(cell=mc,
                                   inputs=X,
                                   initial_state=rnn_tuple_state,
                                   dtype=tf.float32)
Here is the relevant bit from the docs:
initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
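In the snippet above mc.state_size is a tuple, so the second case applies; a quick check, assuming the snippet has been run:

print(mc.state_size)
# (LSTMStateTuple(c=10, h=10), LSTMStateTuple(c=10, h=10), LSTMStateTuple(c=10, h=10))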
So we end up creating a tuple of placeholders, one per cell (layer), each of the required size (batch_size, state_size) with batch_size = None. I elaborated on this answer.
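At training time the placeholder can then be fed an all-zeros array to start with, carrying the returned state over between batches. A minimal sketch building on the snippet above; batch_size and x_batch are made-up names for illustration:

import numpy as np

batch_size = 32
# all-zeros start state, matching state_placeholder's shape
current_state = np.zeros((num_layers, 2, batch_size, state_size), np.float32)
x_batch = np.random.rand(batch_size, 100, 10).astype(np.float32)  # dummy input
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out, current_state = sess.run(
        [outputs, state],
        feed_dict={X: x_batch, state_placeholder: current_state})
# `state` comes back as a tuple of LSTMStateTuples of numpy arrays;
# feed_dict converts it back to the [num_layers, 2, batch, size] shape,
# so it can be fed to state_placeholder directly on the next run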