Keras fit_generator问题

时间:2018-11-23 02:54:36

标签: python keras generator

我跟随this tutorial为我的Keras模型创建了一个自定义生成器。这是显示我所面临问题的MWE:

import sys, keras
import numpy as np
import tensorflow as tf
import pandas as pd
from keras.models import Model
from keras.layers import Dense, Input
from keras.optimizers import Adam
from keras.losses import binary_crossentropy

class DataGenerator(keras.utils.Sequence):
    'Generates data for Keras'
    def __init__(self, list_IDs, batch_size, shuffle=False):
        'Initialization'
        self.batch_size = batch_size
        self.list_IDs = list_IDs
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.floor(len(self.list_IDs) / self.batch_size))

    def __getitem__(self, index):
        'Generate one batch of data'
        # Generate indexes of the batch
        #print('self.batch_size: ', self.batch_size)
        print('index: ', index)
        sys.exit()

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.indexes = np.arange(len(self.list_IDs))
        print('self.indexes: ', self.indexes)
        if self.shuffle == True:
            np.random.shuffle(self.indexes)

    def __data_generation(self, list_IDs_temp):
        'Generates data containing batch_size samples' # X : (n_samples, *dim, n_channels)

        X1 = np.empty((self.batch_size, 10), dtype=float)
        X2 = np.empty((self.batch_size, 12),  dtype=int)

        #Generate data
        for i, ID in enumerate(list_IDs_temp):
            print('i is: ', i, 'ID is: ', ID)

            #Preprocess this sample (omitted)
            X1[i,] = np.repeat(1, X1.shape[1])
            X2[i,] = np.repeat(2, X2.shape[1])

        Y = X1[:,:-1]
        return X1, X2, Y

if __name__=='__main__':
    train_ids_to_use = list(np.arange(1, 321)) #1, 2, ...,320 
    valid_ids_to_use = list(np.arange(321, 481)) #321, 322, ..., 480

    params = {'batch_size': 32}

    train_generator = DataGenerator(train_ids_to_use, **params)
    valid_generator = DataGenerator(valid_ids_to_use, **params)

    #Build a toy model
    input_1 = Input(shape=(3, 10))
    input_2 = Input(shape=(3, 12))
    y_input = Input(shape=(3, 10))

    concat_1 = keras.layers.concatenate([input_1, input_2])
    concat_2 = keras.layers.concatenate([concat_1, y_input])

    dense_1 = Dense(10, activation='relu')(concat_2)
    output_1 = Dense(10, activation='sigmoid')(dense_1)

    model = Model([input_1, input_2, y_input], output_1)
    print(model.summary())

    #Compile and fit_generator
    model.compile(optimizer=Adam(lr=0.001), loss=binary_crossentropy)
    model.fit_generator(generator=train_generator, validation_data = valid_generator, epochs=2, verbose=2)

我不想打乱我的输入数据。我以为这已经得到解决,但是在我的代码中,当我在index中打印出__get_item__时,我得到了随机数。我想要连续的数字。请注意,我正在尝试在sys.exit中使用__getitem__终止该进程,以查看发生了什么。

我的问题:

  1. 为什么index不能连续运行?我该如何解决?

  2. 当我在使用屏幕的终端中运行此命令时,为什么它不响应Ctrl + C?

1 个答案:

答案 0 :(得分:4)

您可以使用shuffle方法的fit_generator参数来连续生成批次。来自fit_generator() documentation

  

随机播放:布尔值。是否在每个纪元开始时重新整理批次的顺序。仅用于Sequencekeras.utils.Sequence)的实例。当steps_per_epoch不是None时无效。

只需将shuffle=False传递到fit_generator

model.fit_generator(generator=train_generator, shuffle=False, ...)