Question

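I want to build a Seq2Seq model to forecast time series data. I am using InferenceHelper with its sample_fn argument. I want to pass each cell's decoder output through a dense layer so that a single output is produced at every time step, so I pass a function that does this as sample_fn.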

Later, I want to concatenate the RNN cell outputs with other, non-time-series features and build more dense layers on top of them.

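The network performs well during training but poorly during inference. I think this is because I am not sharing the same dense layer between training and inference.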

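I tried setting the reuse argument, and I also tried a with tf.variable_scope() block. However, sample_fn is already called inside a specific scope within dynamic_decode, so I cannot use the same scope as during training.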

The relevant parts of my code are shown below:

Placeholders:

inputs = tf.placeholder(shape=(None, 100, 1), dtype=tf.float32, name='inputs')
input_lengths = tf.placeholder(shape=(None,), dtype=tf.int32, name='input_lengths')

targets = tf.placeholder(shape=(None, 100), dtype=tf.float32, name='targets')
target_lengths = tf.placeholder(shape=(None,), dtype=tf.int32, name='target_lengths')

Encoder:

encoder_cell = tf.nn.rnn_cell.MultiRNNCell([tf.contrib.rnn.GRUCell(num_units=16, name='encoder_cell_0')])
decoder_cell = tf.nn.rnn_cell.MultiRNNCell([tf.contrib.rnn.GRUCell(num_units=16, name='decoder_cell_0')])

_, final_encoder_states = tf.nn.dynamic_rnn(cell=encoder_cell, inputs=inputs,
                                                sequence_length=input_lengths, dtype=tf.float32)

Decoder (training):

start_tokens = tf.fill([tf.shape(inputs)[0]], start_token)
start_tokens = tf.cast(tf.expand_dims(start_tokens, 1), dtype=tf.float32)
targets_as_inputs = tf.concat([start_tokens, targets], axis=1)
targets_as_inputs = tf.reshape(targets_as_inputs, (-1, targets_as_inputs.shape[1], 1))

training_helper = tf.contrib.seq2seq.TrainingHelper(inputs=targets_as_inputs, sequence_length=target_lengths, name='training_helper')
training_decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell, helper=training_helper, initial_state=final_encoder_states)

train_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=training_decoder, maximum_iterations=max_target_sequence_length, impute_finished=True)

train_predictions = train_outputs.rnn_output
train_predictions = tf.layers.dense(train_predictions, 1, activation=None, name='output_dense_layer')

Decoder (inference). This is the part that does not work:

def sample_fn(outputs):
    return tf.layers.dense(outputs, 1, activation=None,
                           name='output_dense_layer', reuse=tf.AUTO_REUSE)

infer_helper = tf.contrib.seq2seq.InferenceHelper(sample_fn=sample_fn, sample_shape=(1),
                                                  sample_dtype=tf.float32, start_inputs=start_tokens,
                                                  end_fn=lambda sample_ids: False, next_inputs_fn=None)
infer_decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell, helper=infer_helper, initial_state=final_encoder_states)

infer_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=infer_decoder, maximum_iterations=max_target_sequence_length, impute_finished=True)

infer_predictions = infer_outputs.rnn_output
infer_predictions = sample_fn(infer_predictions)

There is a similar question: How to use tensorflow seq2seq without embeddings?

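Its author uses sample_fn=lambda outputs: outputs. In my case that raises a ValueError because the dimensions do not match. I do not see how it works for them, given that they have multiple cells; sample_fn is supposed to return a single value.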
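The mismatch described above can be illustrated without TensorFlow. The following is a minimal sketch under one assumption: with next_inputs_fn=None, the helper feeds whatever sample_fn returns back in as the next decoder input, so the sample has to match both the declared sample_shape and the input width. The names check_feedback, identity and project_to_one are hypothetical and only model the bookkeeping for a single example at a single step; they are not part of any library.

```python
def check_feedback(cell_output_dim, input_dim, sample_shape, sample_fn):
    """Return the shape problems found for one hypothetical decode step."""
    problems = []
    outputs = [0.0] * cell_output_dim  # stand-in for one cell output vector
    sample = sample_fn(outputs)
    if len(sample) != sample_shape:
        problems.append("sample has %d values, sample_shape declares %d"
                        % (len(sample), sample_shape))
    if len(sample) != input_dim:
        problems.append("sample has %d values, next input expects %d"
                        % (len(sample), input_dim))
    return problems


def identity(outputs):
    # Mirrors sample_fn=lambda outputs: outputs
    return outputs


def project_to_one(outputs):
    # Stand-in for a dense layer with a single output unit
    return [sum(outputs)]


# 16 cell units, scalar decoder inputs, sample_shape of 1 (as in the question)
identity_problems = check_feedback(16, 1, 1, identity)
projected_problems = check_feedback(16, 1, 1, project_to_one)
```

Under that assumption, the identity function can only work when the cell output width already equals the input width, which may be the situation in the linked question.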

Answer 1

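For now, I have solved my problem by writing my own dynamic_decode function. I copied everything from tf.contrib.seq2seq.dynamic_decode except the line

with variable_scope.variable_scope(scope, "decoder", reuse=reuse) as varscope:

an if condition related to varscope, and another if condition that checks the decoder class.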

It is not a great solution, but it is good enough for now.
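The underlying requirement, that training and inference use the same projection weights, can be illustrated without TensorFlow. What follows is a framework-free sketch, not the code from this answer: Dense, TinyRNNCell and the two decode functions are hypothetical stand-ins for the dense layer, the GRU cell, and the training and inference decoders. The projection is an object that owns its weights, so it is created once and passed to both decoding paths instead of being looked up by name in a variable scope.

```python
import math
import random


class Dense:
    """A linear projection that owns its weights, so every caller shares them."""

    def __init__(self, in_dim, rng):
        self.w = [rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
        self.b = 0.0

    def __call__(self, h):
        return sum(wi * hi for wi, hi in zip(self.w, h)) + self.b


class TinyRNNCell:
    """Minimal tanh RNN cell with a scalar input (stand-in for the GRU cell)."""

    def __init__(self, hidden, rng):
        self.w_in = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
                      for _ in range(hidden)]

    def __call__(self, x, h):
        return [math.tanh(self.w_in[i] * x
                          + sum(self.w_rec[i][j] * h[j] for j in range(len(h))))
                for i in range(len(h))]


def decode_teacher_forced(cell, proj, h0, inputs):
    """Training-style decoding: the input at every step is given."""
    h, outputs = h0, []
    for x in inputs:
        h = cell(x, h)
        outputs.append(proj(h))
    return outputs


def decode_free_running(cell, proj, h0, start, steps):
    """Inference-style decoding: each prediction becomes the next input."""
    h, x, outputs = h0, start, []
    for _ in range(steps):
        h = cell(x, h)
        x = proj(h)  # same projection object as in training
        outputs.append(x)
    return outputs


rng = random.Random(0)
cell = TinyRNNCell(hidden=4, rng=rng)
proj = Dense(in_dim=4, rng=rng)  # created once, used by both decoders
h0 = [0.0] * 4

free = decode_free_running(cell, proj, h0, start=1.0, steps=5)
# Feeding the free-running predictions back in as teacher-forced inputs
# reproduces them because both paths share `cell` and `proj`.
forced = decode_teacher_forced(cell, proj, h0, [1.0] + free[:-1])
```

In TensorFlow 1.x the analogous move would presumably be to construct a single tf.layers.Dense object and call it in both the training graph and sample_fn, or to hand it to BasicDecoder through its output_layer argument. That has not been tested against the code in the question.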
