Question

I am trying to build a seq2seq model in TensorFlow (1.4) using the tf.contrib.rnn.ConvLSTMCell API together with the tf.nn.dynamic_rnn API, but I get an error about the dimensions of the input.

My code is:

# features is an image sequence with shape [600, 400, 10], 
# so features is a tensor with shape [batch_size, 600, 400, 10]

features = tf.transpose(features, [0,3,1,2])
features = tf.reshape(features, [params['batch_size'],10,600,400])   

encoder_cell = tf.contrib.rnn.ConvLSTMCell(conv_ndims=2,
                                           input_shape=[600, 400,1],
                                           output_channels=5,
                                           kernel_shape=[7,7],
                                           skip_connection=False)

_, encoder_state = tf.nn.dynamic_rnn(cell=encoder_cell,
                                     inputs=features,
                                     sequence_length=[10]*params['batch_size'],
                                     dtype=tf.float32)

I get the following error:

ValueError: Conv Linear expects all args to be of same Dimension: [[2, 600, 400], [2, 600, 400, 5]]

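Looking at the TensorFlow implementation, it seems that the input coming from dynamic_rnn is only 3-dimensional, whereas the hidden state is 4-dimensional. I tried passing the input as a nested tuple, but that did not work.

This question is similar to "TensorFlow dynamic_rnn regressor: ValueError dimension mismatch", but it differs slightly because that question uses a plain LSTMCell (which works for me).

Could someone give me a minimal example of how to use these two APIs together? Thanks!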

Answer 1

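As far as I can tell from https://github.com/iwyoo/ConvLSTMCell-tensorflow/issues/2, tf.nn.dynamic_rnn does not currently support ConvLSTMCell.

So, as described in https://github.com/iwyoo/ConvLSTMCell-tensorflow/issues/1, you have to build the RNN manually.

An example is provided in the documentation: https://github.com/iwyoo/ConvLSTMCell-tensorflow/blob/master/README.md

Below I have adapted your code based on that example and added comments where necessary.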

import tensorflow as tf  # TensorFlow 1.x (tf.contrib is not available in 2.x)

height = 400
width = 400
time_steps = 25
channel = 10
batch_size = 2

p_input = tf.placeholder(tf.float32, [None, height, width, time_steps, channel])
p_label = tf.placeholder(tf.float32, [None, height, width, 3])

# Split along the time axis (axis 3): a list of length time_steps whose elements have shape (?, 400, 400, 1, 10).
p_input_list = tf.split(p_input, time_steps, 3)
# Drop the length-1 time axis (axis 3): each element now has shape (?, 400, 400, 10).
p_input_list = [tf.squeeze(p_input_, [3]) for p_input_ in p_input_list]

cell = tf.contrib.rnn.ConvLSTMCell(conv_ndims=2, # ConvLSTMCell definition
                                   input_shape=[height, width, channel],
                                   output_channels=5,
                                   kernel_shape=[7, 7],
                                   skip_connection=False)

state = cell.zero_state(batch_size, dtype=tf.float32)  # the initial state is all zeros

# Unroll the RNN manually with a Python loop, as one would for BasicLSTMCell.
with tf.variable_scope("ConvLSTM") as scope:
    for i, p_input_ in enumerate(p_input_list):
        if i > 0:
            scope.reuse_variables()
        # ConvLSTMCell takes a tensor of shape [batch_size, height, width, channel].
        t_output, state = cell(p_input_, state)
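
The split-and-squeeze step above is easy to get wrong, so here is a minimal sketch of the same shape manipulation in NumPy, used only as a stand-in because np.split and np.squeeze take the same axis arguments as tf.split and tf.squeeze. The small sizes are hypothetical and chosen so the example runs quickly.

```python
import numpy as np

# Hypothetical small sizes; the answer above uses 400 x 400 frames and 25 time steps.
batch_size, height, width, time_steps, channel = 2, 8, 8, 5, 3

# Input laid out as [batch, height, width, time, channel], as in p_input.
p_input = np.zeros((batch_size, height, width, time_steps, channel), dtype=np.float32)

# Split along the time axis (axis 3) into time_steps pieces of length 1.
frames = np.split(p_input, time_steps, axis=3)

# Drop the length-1 time axis so each piece is [batch, height, width, channel].
frames = [np.squeeze(frame, axis=3) for frame in frames]

print(len(frames))      # 5
print(frames[0].shape)  # (2, 8, 8, 3)
```

Each list element then has the rank-4 shape that the cell is called with inside the loop.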

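Note that you must feed in images whose height and width are the same. If the height and width do not match, you may need to pad the images.

Hope this helps.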

Answer 2

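In the meantime I figured out how to use the two APIs together. The trick is to pass a 5-D tensor as the input to tf.nn.dynamic_rnn(), where the last dimension is the size of the "vector on the spatial grid" (this comes from the transformation of the input from 2D to 3D, inspired by the paper the implementation is based on: https://arxiv.org/pdf/1506.04214.pdf). In my case the vector size is 1, so I had to expand the dimensions.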
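
The shape change described here can be sketched with NumPy, used only as a stand-in because np.transpose and np.expand_dims follow the same axis semantics as tf.transpose and tf.expand_dims. The small height and width are hypothetical placeholders for the 600 x 400 frames in the question. The sketch shows that, without the extra axis, each time step is rank 3 (as in the error message), and that adding a trailing channel axis makes each step rank 4.

```python
import numpy as np

# Hypothetical small sizes; the question uses height=600, width=400, 10 frames.
batch_size, height, width, time_steps = 2, 6, 4, 10

# A batch of image sequences laid out as [batch, height, width, time].
features = np.zeros((batch_size, height, width, time_steps), dtype=np.float32)

# Move the time axis next to the batch axis: [batch, time, height, width].
features = np.transpose(features, (0, 3, 1, 2))

# One time step of the 4-D input is rank 3, with no channel axis.
step_before = features[:, 0].shape

# Append a channel axis of size 1: [batch, time, height, width, channels].
features = np.expand_dims(features, axis=-1)

# One time step of the 5-D input is rank 4, matching input_shape=[height, width, 1].
step_after = features[:, 0].shape

print(step_before)     # (2, 6, 4)
print(features.shape)  # (2, 10, 6, 4, 1)
print(step_after)      # (2, 6, 4, 1)
```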

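While fixing this error, another question came up: in Section 3.1 of the paper mentioned above, the authors state the equations of the ConvLSTM. They use the Hadamard product for the weights connected to the cell output C. Printing the weights of my ConvLSTMCell in TensorFlow, it looks like the weights Wci, Wcf and Wco are not used at all. So, can anyone tell me the exact implementation of the TF ConvLSTMCell?

By the way, is the output of the TensorFlow ConvLSTMCell C or H (in the paper's notation)?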
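
For reference, the ConvLSTM equations from Section 3.1 of the paper are written out below as I understand them, where $*$ denotes convolution and $\circ$ the Hadamard product. The terms involving $W_{ci}$, $W_{cf}$ and $W_{co}$ are the ones the question is about; $C_t$ is the cell state and $H_t$ the hidden state.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```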

ValueError: ConvLSTMCell and dynamic_rnn
