How do I feed a model that has a list of outputs?

Asked: 2017-02-15 20:10:42

Tags: keras

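Sorry for the title, but I couldn't come up with a better description.

I am trying to train, in batches, a model that should have 13 fully connected output layers. Each output layer has only two nodes (but, as stated, is fully connected).

The model's outputs are built like this: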

outputs = list()

for i in range(num_labels):
    out_y = Dense(2, activation='softmax', name='out_{:d}'.format(i))(convolution_layer)
    outputs.append(out_y)

self.model = Model(input=inputs, output=outputs)

However, I can't figure out how to feed this model. I tried using an output array of shape [batch_size, 13, 1, 2]:

y = np.zeros((batch_size, 13, 1, 2))

But for a batch of size 2 I get:

ValueError: The model expects 13 input arrays, but only received one array. Found: array with shape (2, 13, 1, 2)

I have tried a few other things, but it is not clear to me what the model expects as input.

How can I train this model?

I also tried passing a list of numpy arrays, where the first level of the batch represents the samples (2 here) and the second level is, for each sample, a list of 13 numpy arrays. But I got:

ValueError: Error when checking model target: you are passing a list as input to your model, but the model expects a list of 13 Numpy arrays instead. The list you passed was: [[array([ 0.,  1.]), array([ 0.,  1.]), array([ 0.,  1.]), array([ 0.,  1.]), array([ 0.,  1.]), array([ 0.,  1.]), array([ 0.,  1.]), array([ 0.,  1.]), array([ 0.,  1.]), array([ 1.,  0.]), array([ 

As suggested, I also tried returning a list() of numpy arrays of size [13, 2], one per sample. The error then became:

ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 13 arrays but instead got the following list of 2 arrays: [array([[ 0.,  1.],
       [ 0.,  1.],
       [ 0.,  1.],
       [ 0.,  1.],
       [ 0.,  1.],
       [ 0.,  1.],
       [ 0.,  1.],
       [ 0.,  1.],
       [ 0.,  1.],
       [ 1.,  0.],
       [ ...

Code

Below you can find the current code, which generates a single sample in sample_generator and a complete batch in batch_generator (using sample_generator).

Please note: the code now shows how I generate a list() of [13, 2] ndarrays, where the number of such ndarrays in the list is defined by batch_size.

import json
import numpy as np


def batch_generator(w2v, file_path, meta_info, batch_size, sample_generator_fn, embedding_size):

    try:

        x = np.zeros((batch_size, meta_info.max_sequence_length, embedding_size, 1))
        y = list()  # np.zeros((batch_size, 13, 1, 2))

        file = open(file_path)

        while True:

            x[:] = 0.0
            #y[:] = 0.0

            for batch in range(batch_size):

                sentence_info_json = file.readline()

                # At end of file, rewind and start over.
                if sentence_info_json == '':
                    file.seek(0)
                    sentence_info_json = file.readline()

                sample = sample_generator_fn(w2v, sentence_info_json, meta_info)

                if not sample:
                    continue

                sentence_embedding = sample[0]

                final_length = len(sentence_embedding)

                x[batch, :final_length, :, 0] = sentence_embedding
                # One [13, 2] target array per sample.
                y.append(sample[1])

            # Shuffle the samples within the batch.
            shuffled = np.asarray(range(batch_size))
            np.random.shuffle(shuffled)

            x = x[shuffled]
            #y = y[shuffled]
            y = [y[i] for i in shuffled]

            yield x, y

    except Exception as e:
        print('Error in generator.')
        print(e)
        raise e


def sample_generator(w2v, sentence_info_json, meta_info):

    if not sentence_info_json:
        print('???')

    sentence_info = json.loads(sentence_info_json)

    tokens = [token['word'] for token in sentence_info['corenlp']['tokens']]
    sentence = Sentence(tokens=tokens)

    sentence_embedding = w2v.get_word_vectors(sentence.tokens.tolist())
    sentence_embedding = np.asarray([word_vector for word_vector in sentence_embedding if word_vector is not None])

    final_length = len(sentence_embedding)

    if final_length == 0:
        return None

    y = np.zeros((2, len(meta_info.category_dict)))
    y[1, :] = 1.

    #y_list = []

    y_tar = np.zeros((len(meta_info.category_dict), 2))

    for i in range(len(meta_info.category_dict)):
        y_tar[i][1] = 1.0
        # y_list.append(np.asarray([0.0, 1.0]))

    for opinion in sentence_info['opinions']:
        index = meta_info.category_dict[opinion['category']]

        y_tar[index][0] = 1.0
        y_tar[index][1] = 0.0

        #y_list[index][0] = 1.0
        #y_list[index][1] = 0.0

    return sentence_embedding, y_tar

As requested, the call to fit_generator():

cnn.model.fit_generator(generator=batch_generator(word2vec,
                                                  train_file, train_meta_info,
                                                  num_batches, sample_generator,
                                                  embedding_size),
                        samples_per_epoch=2000,
                        nb_epoch=2,
                        # validation_data=batch_generator(test_file_path, train_meta_info),
                        # nb_val_samples=100,
                        verbose=True)

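1 Answer:

Answer 0 (score: 3)

Your output should be a list, as the error message says. Each element of the list should be a numpy array of size [batch_size, nb_outputs]. So in your case, that is a list of 13 elements, each of size [batch_size, 2].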
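To make the shape requirement concrete, here is a minimal NumPy-only sketch of the conversion. It assumes each sample produces a target array of shape [13, 2], as sample_generator in the question does. The helper name to_output_list and the dummy data are hypothetical, not part of the original code.

```python
import numpy as np


def to_output_list(per_sample_targets):
    """Turn batch_size arrays of shape (num_outputs, 2) into
    num_outputs arrays of shape (batch_size, 2)."""
    # Stack the per-sample targets into one (batch_size, num_outputs, 2) array.
    stacked = np.stack(per_sample_targets, axis=0)
    # Slice along the output axis: one (batch_size, 2) array per output layer.
    return [stacked[:, i, :] for i in range(stacked.shape[1])]


# Hypothetical batch: 2 samples, 13 outputs, every target set to [0, 1].
batch_size, num_outputs = 2, 13
samples = []
for _ in range(batch_size):
    y_tar = np.zeros((num_outputs, 2))
    y_tar[:, 1] = 1.0
    samples.append(y_tar)

y = to_output_list(samples)
print(len(y), y[0].shape)  # 13 (2, 2)
```

In a generator like the one in the question, this would presumably mean collecting the per-sample targets into a fresh list for every batch, applying the same shuffle order to them as to x, and yielding x together with the converted list. This assumes every row of x has exactly one matching target.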