Question

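I want to build an LRCN in TensorFlow, which is essentially a seq2seq model over images. It is like a typical language model, except that each image must first be converted into an image_embedding by a CNN.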

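The problem is that when I read in an image sequence as a tensor of shape [seq_length, batch, w, h, c], I want to encode each tensor with an encoder (for example AlexNet). (I unpack the tensor to get a list of seq_length tensors.)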
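To make the shapes concrete, here is a minimal sketch of the unpacking step. It uses NumPy arrays as a stand-in for TensorFlow tensors, and the sizes are hypothetical values chosen only for illustration; in TensorFlow the same split along the first axis would typically be done with tf.unstack.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
seq_length, batch, w, h, c = 4, 2, 8, 8, 3

# Stand-in for the image-sequence tensor of shape [seq_length, batch, w, h, c].
images = np.zeros((seq_length, batch, w, h, c), dtype=np.float32)

# Split along the time axis: one [batch, w, h, c] array per time step.
frames = [images[t] for t in range(seq_length)]

print(len(frames), frames[0].shape)
```

Each element of frames is what the encoder would receive for one time step.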

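I tried using tf.while_loop to iterate over the list of tensors, but gradient computation does not seem to work when slicing is used inside tf.while_loop. There is some discussion here.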

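Is there a way to loop over a list of tensors and encode each one ([b, w, h, c]) with a learnable network? I have tried several approaches, but the gradient computation seems to be wrong. I think there must be a simple way to do this, but I am new to TensorFlow, so any help or examples would be appreciated.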

The while-loop approach I tried:

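  import tensorflow as tf
  from tensorflow.python.ops import tensor_array_ops

  # inputT has shape [sequence_length, batch, w, h, c].
  # encoder, decoder, phase_train, batch_size and sequence_length are defined elsewhere.
  input_array = tensor_array_ops.TensorArray(
      dtype=tf.float32, size=sequence_length,
      dynamic_size=False, infer_shape=True)
  encoded_array = tensor_array_ops.TensorArray(
      dtype=tf.float32, size=sequence_length,
      dynamic_size=False, infer_shape=True)
  decoded_array = tensor_array_ops.TensorArray(
      dtype=tf.float32, size=sequence_length,
      dynamic_size=False, infer_shape=True)

  # Split inputT along the time axis, one element per time step.
  # (TensorArray.unpack was renamed to unstack in TensorFlow 1.0.)
  input_array = input_array.unpack(inputT)

  def encoder_body(index, inputs, outputs):
    # Encode the frame batch at this time step and store the embedding.
    x = inputs.read(index)
    outputs = outputs.write(index, encoder(x, phase_train))
    return index + 1, inputs, outputs

  def decoder_body(index, inputs, outputs):
    # Decode the embedding at this time step and store the result.
    x = inputs.read(index)
    outputs = outputs.write(index, decoder(x, phase_train, batch_size))
    return index + 1, inputs, outputs

  # Run the encoder over all time steps, then the decoder over the encodings.
  _, _, encoded_array = tf.while_loop(
      lambda i, j, k: i < sequence_length,
      encoder_body, (tf.constant(0), input_array, encoded_array))
  _, _, decoded_array = tf.while_loop(
      lambda i, j, k: i < sequence_length,
      decoder_body, (tf.constant(0), encoded_array, decoded_array))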
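One possible way to avoid the loop altogether, sketched here under assumptions rather than as a confirmed fix: if the encoder treats every frame independently (no state carried between time steps), the time axis can be merged into the batch axis, the encoder applied once, and the result reshaped back. The sketch below uses NumPy in place of TensorFlow, and toy_encoder, weights and embed_dim are hypothetical stand-ins for the real CNN. It only checks that the folded version produces the same values as the per-step loop; in TensorFlow the two reshapes would typically be done with tf.reshape.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
seq_length, batch, w, h, c, embed_dim = 4, 2, 8, 8, 3, 16

rng = np.random.default_rng(0)
images = rng.standard_normal((seq_length, batch, w, h, c))
weights = rng.standard_normal((w * h * c, embed_dim))

def toy_encoder(x):
    """Hypothetical stand-in for the CNN: flatten each image, apply one linear map."""
    return x.reshape(x.shape[0], -1) @ weights

# Loop version: encode each [batch, w, h, c] slice separately, then stack.
looped = np.stack([toy_encoder(images[t]) for t in range(seq_length)])

# Folded version: merge time into the batch axis, encode once, split back.
flat = images.reshape(seq_length * batch, w, h, c)
folded = toy_encoder(flat).reshape(seq_length, batch, embed_dim)

print(folded.shape, np.allclose(looped, folded))
```

Note that if the encoder uses batch normalization (as the phase_train argument may suggest), folding changes which examples the batch statistics are computed over, so the two versions would no longer be exactly equivalent during training.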

Tensorflow image sequence to sequence

0 answers: