ValueError: ConvLSTMCell and dynamic_rnn

Time: 2017-11-23 15:48:51

Tags: python tensorflow neural-network conv-neural-network rnn

I am trying to build a seq2seq model in TensorFlow (1.4) using the tf.contrib.rnn.ConvLSTMCell API together with the tf.nn.dynamic_rnn API, but I get an error about the dimensions of the input.

My code is:

# features is an image sequence with shape [600, 400, 10], 
# so features is a tensor with shape [batch_size, 600, 400, 10]

features = tf.transpose(features, [0,3,1,2])
features = tf.reshape(features, [params['batch_size'],10,600,400])   

encoder_cell = tf.contrib.rnn.ConvLSTMCell(conv_ndims=2,
                                           input_shape=[600, 400,1],
                                           output_channels=5,
                                           kernel_shape=[7,7],
                                           skip_connection=False)

_, encoder_state = tf.nn.dynamic_rnn(cell=encoder_cell,
                                     inputs=features,
                                     sequence_length=[10]*params['batch_size'],
                                     dtype=tf.float32)

I get the following error:

ValueError: Conv Linear expects all args to be of same Dimension: [[2, 600, 400], [2, 600, 400, 5]]

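Looking at the TensorFlow implementation, it seems that the input to dynamic_rnn is only 3-dimensional, in contrast to the hidden state, which is 4-dimensional. I tried passing the input as a nested tuple, but that did not work.

This question is similar to "TensorFlow dynamic_rnn regressor: ValueError dimension mismatch", but it is slightly different because that question uses a plain LSTMCell (which works for me).

Can anyone give me a minimal example of how to use these two APIs together? Thanks!

2 answers:

Answer 0: (score: 0)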

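As far as I can tell from https://github.com/iwyoo/ConvLSTMCell-tensorflow/issues/2, tf.nn.dynamic_rnn does not currently support ConvLSTMCell.

So, as described in https://github.com/iwyoo/ConvLSTMCell-tensorflow/issues/1, you have to build the RNN manually.

An example is given in the documentation: https://github.com/iwyoo/ConvLSTMCell-tensorflow/blob/master/README.md

Below I have modified your code based on that example and added comments where necessary.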

import tensorflow as tf  # written for TensorFlow 1.x (uses tf.contrib)

height = 400
width = 400
time_steps = 25
channel = 10
batch_size = 2

p_input = tf.placeholder(tf.float32, [None, height, width, time_steps, channel])
p_label = tf.placeholder(tf.float32, [None, height, width, 3])

# Split along the time axis (axis 3): a list of length time_steps,
# each element with shape (?, 400, 400, 1, 10).
p_input_list = tf.split(p_input, time_steps, 3)
# Squeeze out the time axis, so each element has shape (?, 400, 400, 10).
p_input_list = [tf.squeeze(p_input_, [3]) for p_input_ in p_input_list]

cell = tf.contrib.rnn.ConvLSTMCell(conv_ndims=2, # ConvLSTMCell definition
                                   input_shape=[height, width, channel],
                                   output_channels=5,
                                   kernel_shape=[7, 7],
                                   skip_connection=False)

state = cell.zero_state(batch_size, dtype=tf.float32)  # the initial state is all zeros

# Same usage as BasicLSTMCell: unroll the RNN manually with a Python loop.
with tf.variable_scope("ConvLSTM") as scope:
    for i, p_input_ in enumerate(p_input_list):
        if i > 0:
            # Share the cell's weights across all time steps.
            scope.reuse_variables()
        # ConvLSTMCell takes a tensor of shape [batch_size, height, width, channel].
        t_output, state = cell(p_input_, state)

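Note that the input images must have the same height and width. If the height and width do not match, you may need to pad them.

Hope this helps.

Answer 1: (score: 0)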

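In the meantime I figured out how to use the two APIs together. The trick is to pass a 5D tensor as input to tf.nn.dynamic_rnn(), where the last dimension is the size of the "vector on the spatial grid" (this comes from the transformation of the input from 2D to 3D, inspired by the paper the implementation is based on: https://arxiv.org/pdf/1506.04214.pdf). In my case the vector size is 1, so I had to add that dimension explicitly.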
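A minimal sketch of that reshaping, assuming the input starts as [batch, height, width, time] as in the question. The helper names to_conv_lstm_input and build_encoder are made up for illustration. The layout step uses NumPy so it can be checked on its own; build_encoder repeats the question's cell settings and assumes TensorFlow 1.4 (tf.contrib no longer exists in TensorFlow 2.x), so it is defined here but not called.

```python
import numpy as np


def to_conv_lstm_input(features):
    """Rearrange [batch, height, width, time] into [batch, time, height, width, 1]."""
    # Move the time axis next to the batch axis: [batch, time, height, width].
    time_major = np.transpose(features, (0, 3, 1, 2))
    # Append a channel axis of size 1, so every time step is a 4-D image batch.
    return np.expand_dims(time_major, axis=-1)


def build_encoder(batch_size, time_steps, height, width, channels=1):
    """Hypothetical graph builder; assumes TensorFlow 1.4 with tf.contrib."""
    import tensorflow as tf  # imported lazily so the NumPy part runs without TF

    # 5-D input: [batch, time, height, width, channels].
    inputs = tf.placeholder(
        tf.float32, [batch_size, time_steps, height, width, channels])
    # input_shape excludes the batch and time axes and must include channels.
    cell = tf.contrib.rnn.ConvLSTMCell(conv_ndims=2,
                                       input_shape=[height, width, channels],
                                       output_channels=5,
                                       kernel_shape=[7, 7],
                                       skip_connection=False)
    outputs, state = tf.nn.dynamic_rnn(cell=cell,
                                       inputs=inputs,
                                       sequence_length=[time_steps] * batch_size,
                                       dtype=tf.float32)
    return inputs, outputs, state


# Small stand-in for the [batch_size, 600, 400, 10] tensor from the question.
frames = np.arange(2 * 6 * 4 * 10, dtype=np.float32).reshape(2, 6, 4, 10)
frames_5d = to_conv_lstm_input(frames)
print(frames_5d.shape)  # (2, 10, 6, 4, 1)
```

With the real data, the array returned by to_conv_lstm_input would be fed to the inputs placeholder returned by build_encoder(batch_size, 10, 600, 400).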

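While fixing this error, another question came up: in Section 3.1 of the paper mentioned above, the authors state the equations of the ConvLSTM. They use the Hadamard product for the weights connected to the cell state C. Printing the weights of my ConvLSTMCell in TensorFlow, it looks like the weights Wci, Wcf and Wco are not used at all. So, can anyone tell me the exact implementation of the TF ConvLSTMCell?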
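For reference, these are the ConvLSTM equations from Section 3.1 of the paper as I read them, where $*$ denotes convolution and $\circ$ the Hadamard product. The terms with $W_{ci}$, $W_{cf}$ and $W_{co}$ are the peephole weights the question is about.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```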

By the way, is the output of the TensorFlow ConvLSTMCell C or H (in the paper's notation)?