Implementing an attention mechanism in the Maluuba seq2seq model

Date: 2018-07-24 18:14:41

Tags: seq2seq

Hello, I am trying to add attention to the simple Maluuba/qgen-workshop seq2seq model, but I cannot figure out what the correct batch_size to pass to the decoder's initial state should be. Here is my code:

# Attention
# attention_states: [batch_size, max_time, num_units]
attention_states = tf.transpose(encoder_outputs, [1, 0, 2])

# Create an attention mechanism
attention_mechanism = tf.contrib.seq2seq.LuongAttention(
    encoder_cell.state_size, attention_states,
    memory_sequence_length=None)

decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    decoder_cell, attention_mechanism,
    attention_layer_size=encoder_cell.state_size)

batch = next(training_data())
batch = collapse_documents(batch)

# Size the decoder's initial state from this batch, then copy in the
# final encoder state.
initial_state = decoder_cell.zero_state(
    batch["size"], tf.float32).clone(cell_state=encoder_state)

decoder = seq2seq.BasicDecoder(decoder_cell, helper, initial_state, output_layer=projection)

It gives me this error:

    InvalidArgumentError (see above for traceback): assertion failed: [When applying AttentionWrapper attention_wrapper_1: Non-matching batch sizes between the memory (encoder output) and the query (decoder output).

    Are you using the BeamSearchDecoder?  You may need to tile your memory input via the tf.contrib.seq2seq.tile_batch function with argument multiple=beam_width.] [Condition x == y did not hold element-wise:] [x (decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/x:0) = ] [99] [y (LuongAttention/strided_slice_1:0) = ] [29]
    [[Node: decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/Assert/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT32, DT_STRING, DT_INT32], summarize=3, _device="/job:localhost/replica:0/task:0/cpu:0"](decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/All, decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/Assert/Assert/data_0, decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/Assert/Assert/data_1, decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/Assert/Assert/data_2, decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/x, decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/Assert/Assert/data_4, decoder/while/BasicDecoderStep/decoder/attention_wrapper/assert_equal/Equal/Enter)]]
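
For what it's worth, I am using a plain BasicDecoder, not the BeamSearchDecoder, so I don't think the tile_batch hint in the assertion applies here. For completeness, my understanding of that pattern (a hypothetical sketch with an example beam_width, not code from my model) is:

# Only needed with a BeamSearchDecoder: repeat each batch entry
# beam_width times so the memory's batch dimension becomes
# batch_size * beam_width and matches the decoder's queries.
beam_width = 5  # hypothetical example value
tiled_attention_states = tf.contrib.seq2seq.tile_batch(
    attention_states, multiplier=beam_width)
tiled_encoder_state = tf.contrib.seq2seq.tile_batch(
    encoder_state, multiplier=beam_width)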

1 Answer:

Answer 0 (score: 0):

We currently have _MAX_BATCH_SIZE = 128, but each batch ends up with a different size, because we make sure that all the questions for a story go into the same batch. That is why each batch carries a 'size' key indicating its actual size.
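
To make that concrete, here is a hypothetical illustration using the helpers from your snippet (the sizes 29 and 99 mirror the numbers in your assertion):

batches = training_data()
first = collapse_documents(next(batches))
second = collapse_documents(next(batches))
# Consecutive batches can have different sizes:
print(first["size"], second["size"])  # e.g. 29 99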

You seem to know this already, so I think the problem is something else. Maybe encoder_cell.state_size was set using the batch size of an earlier batch?
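
If that is what is happening, one workaround you could try (a minimal, untested sketch of my own, reusing attention_states, decoder_cell, and encoder_state from your snippet, not code from the workshop repo) is to take the batch size from the graph at run time instead of passing a Python integer taken from one particular batch:

# attention_states is batch-major [batch_size, max_time, num_units],
# so tf.shape(...)[0] is the runtime batch size of whatever batch is
# actually fed to the graph.
dynamic_batch_size = tf.shape(attention_states)[0]

# Build the decoder's initial state from that runtime size so the
# query's batch dimension always matches the attention memory's.
initial_state = decoder_cell.zero_state(
    dynamic_batch_size, tf.float32).clone(cell_state=encoder_state)

That way the decoder query and the attention memory share the same batch dimension no matter which batch is fed.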