Question

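I am trying to apply sequence-to-sequence modelling to EEG data. The encoding works just fine, but getting the decoding to work is proving problematic. The input data has the shape None×3000×31, where the second dimension is the sequence length.

The encoder looks like this: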

initial_state = lstm_sequence_encoder.zero_state(batchsize, dtype=self.model_precision)

encoder_output, state = dynamic_rnn(
     cell=LSTMCell(32),
     inputs=lstm_input, # shape=(None,3000,32)
     initial_state=initial_state, # zeroes
     dtype=lstm_input.dtype # tf.float32
)

I use the final state of the RNN as the initial state of the decoder. For training, I use the TrainingHelper:

training_helper = TrainingHelper(target_input, [self.sequence_length])
training_decoder = BasicDecoder(
     cell=lstm_sequence_decoder,
     helper=training_helper,
     initial_state=thought_vector
)
output, _, _ = dynamic_decode(
     decoder=training_decoder,
     maximum_iterations=3000
)
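
For readers unfamiliar with teacher forcing: the post does not show how target_input is built, so the following is only a minimal NumPy sketch under the assumption that the decoder inputs are the targets shifted right by one time step, with a zero frame at the start. The name make_decoder_inputs is hypothetical and not part of any library.

```python
import numpy as np

def make_decoder_inputs(targets):
    """Shift targets right by one step along the time axis.

    targets: array of shape [batch, time, features].
    The first decoder input is a zero frame (an assumption, chosen to
    match the zero start inputs used at inference time).
    """
    start = np.zeros_like(targets[:, :1, :])
    return np.concatenate([start, targets[:, :-1, :]], axis=1)

targets = np.arange(2 * 4 * 3, dtype=np.float32).reshape(2, 4, 3)
decoder_inputs = make_decoder_inputs(targets)
print(decoder_inputs.shape)  # (2, 4, 3)
```

With this layout the decoder sees the true value from step t-1 while it is trained to predict step t.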

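My troubles start when I try to implement inference. Since I am using non-sentence data, I do not need to tokenize or embed, because the data is essentially embedded already. The InferenceHelper class seemed the best way to achieve my goal, so that is what I use. I'll give my code, then explain my problem.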

def _sample_fn(decoder_outputs):
     return decoder_outputs
def _end_fn(_):
     return tf.tile([False], [self.lstm_layersize]) # Batch-size is sequence-length because of time major
inference_helper = InferenceHelper(
     sample_fn=_sample_fn,
     sample_shape=[32],
     sample_dtype=target_input.dtype,
     start_inputs=tf.zeros(batchsize_placeholder, 32), # the batchsize varies
     end_fn=_end_fn
)
inference_decoder = BasicDecoder(
     cell=lstm_sequence_decoder,
     helper=inference_helper,
     initial_state=thought_vector
)
output, _, _ = dynamic_decode(
     decoder=inference_decoder,
     maximum_iterations=3000
)

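The problem

I don't know what the shape of the inputs should be. I know the start inputs should be zero, since it is the first time step. But this throws errors; it expects the input to be (1,32).

I also thought I should pass the output of each time step unchanged to the next. However, this causes problems at run time: the batch size varies, so the shape is partial. The library throws an exception when it tries to convert start_inputs to a tensor: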

...
self._start_inputs = ops.convert_to_tensor(
      start_inputs, name='start_inputs')

Any ideas?

Answer 1

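This was a lesson in poor documentation.

I fixed my issue, but failed to address the variable batch-size problem.

My _end_fn caused an issue that I was unaware of. I also managed to work out the appropriate fields for the InferenceHelper. I have given the fields names in case anyone needs guidance in the future: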

 def _end_fn(_):
      return tf.tile([False], [batchsize])
 inference_helper = InferenceHelper(
      sample_fn=_sample_fn,
      sample_shape=[lstm_number_of_units], # In my case, 32
      sample_dtype=tf.float32, # Depends on the data
      start_inputs=tf.zeros((batchsize, lstm_number_of_units)),
      end_fn=_end_fn
 )
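
To make the roles of these fields concrete, here is a minimal NumPy sketch of the feedback loop they describe. It is not the library's implementation: toy_cell, greedy_decode, and the weight matrices are hypothetical stand-ins. The loop only mirrors the idea of zero start inputs of shape [batch, units], an identity sample_fn, and an end_fn that returns one boolean per batch entry.

```python
import numpy as np

def toy_cell(inputs, state, w_in, w_state):
    """Hypothetical stand-in for the decoder cell: one plain tanh RNN step."""
    new_state = np.tanh(inputs @ w_in + state @ w_state)
    return new_state, new_state  # (output, next state)

def greedy_decode(initial_state, w_in, w_state, max_steps):
    batch, units = initial_state.shape          # batch size read at run time
    inputs = np.zeros((batch, units))           # start inputs: [batch, units]
    state = initial_state
    outputs = []
    for _ in range(max_steps):
        output, state = toy_cell(inputs, state, w_in, w_state)
        outputs.append(output)
        finished = np.zeros(batch, dtype=bool)  # end_fn: one flag per batch entry
        if finished.all():                      # never true here; stops at max_steps
            break
        inputs = output                         # identity sample_fn: feed output back
    return np.stack(outputs, axis=1)            # [batch, steps, units]

rng = np.random.default_rng(0)
units = 32
w_in = rng.normal(scale=0.1, size=(units, units))
w_state = rng.normal(scale=0.1, size=(units, units))
for batch in (1, 7):
    decoded = greedy_decode(rng.normal(size=(batch, units)), w_in, w_state, max_steps=5)
    print(decoded.shape)  # (1, 5, 32) then (7, 5, 32)
```

Because the batch size is taken from the state's shape at run time, the same function handles any batch size. The finished vector has length batch, not the number of units, which is the mistake in the original _end_fn.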

For the batch-size problem, I am considering two things:

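Change the internal state of my model object. My TensorFlow computational graph is built inside a class, and a class field records the batch size. Changing this field during training may work. Or:
Pad the batches so that they are 200 sequences long. This will waste time.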
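
The padding option can be sketched as follows. This is a minimal NumPy illustration, assuming batch-major arrays and a fixed batch size of 200; pad_batch and unpad are hypothetical helper names, and the sequence length is shortened to 10 to keep the example small.

```python
import numpy as np

FIXED_BATCH = 200  # fixed batch size assumed by the graph

def pad_batch(batch, fixed_size=FIXED_BATCH):
    """Zero-pad along axis 0 up to fixed_size; return (padded, real_count)."""
    real = batch.shape[0]
    if real > fixed_size:
        raise ValueError("batch has %d rows, more than %d" % (real, fixed_size))
    padding = np.zeros((fixed_size - real,) + batch.shape[1:], dtype=batch.dtype)
    return np.concatenate([batch, padding], axis=0), real

def unpad(outputs, real):
    """Drop the rows that correspond to padding."""
    return outputs[:real]

batch = np.ones((37, 10, 32), dtype=np.float32)
padded, real = pad_batch(batch)
print(padded.shape, real)         # (200, 10, 32) 37
print(unpad(padded, real).shape)  # (37, 10, 32)
```

The padded rows are still run through the model, which is the wasted time mentioned above.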

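Best would be a way to manage the batch size dynamically.

EDIT: I found a way. It simply involves replacing the parentheses with square brackets: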

 inference_helper = InferenceHelper(
      sample_fn=_sample_fn,
      sample_shape=[self.lstm_layersize],
      sample_dtype=target_input.dtype,
      start_inputs=tf.zeros([batchsize, self.lstm_layersize]),
      end_fn=_end_fn
 )

Seq2seq for non-sentence, float data; stuck configuring the decoder
