ValueError: Input 0 of layer lstm_cell_4 is incompatible with the layer: expected ndim=2, found ndim=3. Full shape received: [128, None, 200]

Time: 2018-08-08 18:29:56

Tags: tensorflow nlp deep-learning chatbot seq2seq

So I followed a tutorial on GitHub to build a sequence-to-sequence model. I'm trying to train a chatbot on the Cornell movie-dialogs dataset, and when I run the program I get this error:

ValueError: Input 0 of layer lstm_cell_4 is incompatible with the layer: expected ndim=2, found ndim=3. Full shape received: [128, None, 200]

I have no idea where the problem is.

The input dataset has shape (128, 20), i.e. batch_size = 128 and sequence length 20.

I think the problem is in one of these functions:

Func_1

def decoding_layer_infer(dec_cell, keep_prob, dec_embeddings, batch_size, start_of_sequence_id,
                         end_of_sequence_id, output_layer, encoder_state, max_target_sequence_length):

    # Wrap the decoder cell with dropout.
    dec_cell = tf.contrib.rnn.DropoutWrapper(dec_cell,
                                             output_keep_prob=keep_prob)

    # Greedy decoding: start every sequence with <GO> and stop at <EOS>.
    helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(dec_embeddings,
                                                      tf.fill([batch_size], start_of_sequence_id),
                                                      end_of_sequence_id)

    decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,
                                              helper,
                                              encoder_state,
                                              output_layer)

    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder,
                                                      impute_finished=True,
                                                      maximum_iterations=max_target_sequence_length)
    return outputs
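
For reference, in the TF 1.x tf.contrib.seq2seq API the first argument to GreedyEmbeddingHelper is the embedding matrix itself (or a callable); the helper performs the embedding lookup internally at each decoding step, so it expects a 2-D [vocab_size, embedding_dim] tensor rather than an already-embedded 3-D [batch, time, embedding_dim] tensor. A minimal sketch with made-up sizes and token ids, just to show the documented call pattern:

import tensorflow as tf  # TF 1.x, where tf.contrib still exists

# Illustrative sizes only.
vocab_size, embedding_dim, batch_size = 10000, 200, 128

# GreedyEmbeddingHelper wants this 2-D matrix, not the 3-D embedded inputs.
embeddings = tf.Variable(tf.random_uniform([vocab_size, embedding_dim]))
helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
    embeddings,                # 2-D [vocab_size, embedding_dim]
    tf.fill([batch_size], 1),  # start-of-sequence token ids, e.g. the <GO> id
    2)                         # end-of-sequence token id, e.g. the <EOS> id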

Func_2

def decoding_layer(target_vocab_to_int, decoding_embedding_size, dec_input, rnn_size,
                   keep_prob, target_seq_length, encoder_state,
                   max_target_sequence_length, batch_size, num_layers):

    # Embedding matrix for the target vocabulary and the embedded decoder inputs.
    target_vocab_size = len(target_vocab_to_int)
    dec_embeddings = tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)

    # Stacked LSTM decoder cell.
    cells = tf.contrib.rnn.MultiRNNCell([tf.contrib.rnn.LSTMCell(rnn_size) for _ in range(num_layers)])

    with tf.variable_scope('decoder'):
        output_layer = tf.layers.Dense(target_vocab_size)
        train_outputs = decoding_layer_train(cells, keep_prob,
                                             dec_embed_input,
                                             target_seq_length,
                                             encoder_state,
                                             output_layer,
                                             max_target_sequence_length)

    # Reuse the same decoder variables for inference.
    with tf.variable_scope('decoder', reuse=True):
        infer_outputs = decoding_layer_infer(cells, keep_prob,
                                             dec_embed_input,
                                             batch_size,
                                             target_vocab_to_int['<GO>'],
                                             target_vocab_to_int['<EOS>'],
                                             output_layer,
                                             encoder_state,
                                             max_target_sequence_length)

    return train_outputs, infer_outputs
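
decoding_layer_train is called above but not shown in the question. For context, a typical TF 1.x implementation in this tutorial style looks roughly like the following sketch (an assumption about its shape, not the asker's actual code):

def decoding_layer_train(dec_cell, keep_prob, dec_embed_input,
                         target_seq_length, encoder_state, output_layer,
                         max_target_sequence_length):
    # Sketch only: teacher forcing with the ground-truth embedded targets.
    dec_cell = tf.contrib.rnn.DropoutWrapper(dec_cell,
                                             output_keep_prob=keep_prob)
    helper = tf.contrib.seq2seq.TrainingHelper(dec_embed_input,
                                               target_seq_length)
    decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell, helper,
                                              encoder_state, output_layer)
    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
        decoder, impute_finished=True,
        maximum_iterations=max_target_sequence_length)
    return outputs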

Func_3

def seq2seq_model(input_data, source_vocab_size, encoding_emb_size, num_layers,
                  keep_prob, rnn_size, target_data,
                  target_vocab_to_int, batch_size,
                  decoding_embedding_size, target_seq_length, max_target_sequence_length):

    # Encode the source sequence.
    encoder_outputs, encoder_states = encoding_layer(input_data,
                                                     source_vocab_size,
                                                     encoding_emb_size,
                                                     rnn_size,
                                                     num_layers,
                                                     keep_prob)

    # Prepare the decoder input from the targets (function not shown here).
    dec_input = process_decoder_input(target_vocab_to_int,
                                      target_data, batch_size)

    # Build the training and inference decoders.
    train_output, infer_output = decoding_layer(target_vocab_to_int,
                                                decoding_embedding_size,
                                                dec_input, rnn_size,
                                                keep_prob,
                                                target_seq_length,
                                                encoder_states,
                                                max_target_sequence_length,
                                                batch_size, num_layers)

    return train_output, infer_output
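
For context, a graph like this is normally driven from placeholders along these lines (a sketch with assumed names; the hyperparameters and vocab dict are taken to be defined elsewhere in the full code):

input_data = tf.placeholder(tf.int32, [None, None], name='input')
target_data = tf.placeholder(tf.int32, [None, None], name='targets')
keep_prob = tf.placeholder(tf.float32, name='keep_prob')
target_seq_length = tf.placeholder(tf.int32, [None], name='target_seq_length')
max_target_sequence_length = tf.reduce_max(target_seq_length, name='max_target_len')

train_output, infer_output = seq2seq_model(
    input_data, source_vocab_size, encoding_emb_size, num_layers,
    keep_prob, rnn_size, target_data, target_vocab_to_int, batch_size,
    decoding_embedding_size, target_seq_length, max_target_sequence_length)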

The full code is here: Full code excluding dataset preprocessing

Please help me.

0 Answers:

No answers yet.