How can I run predict_generator on a large dataset with limited memory?

Asked: 2018-01-17 10:43:53

Tags: tensorflow machine-learning keras imagenet

Currently I feed all of the images to predict_generator at once. I would like to feed it a small set of the images held by validation_generator and predict on just those, so that a large dataset does not run into memory problems. How should I change the following code?

from keras import applications
from keras.preprocessing.image import ImageDataGenerator

top_model_weights_path = '/home/rehan/ethnicity.071217.23-0.28.hdf5'
path = "/home/rehan/countries/pakistan/guys/"
img_width, img_height = 139, 139
confidence = 0.8

# InceptionResNetV2 without its classification head, used as a feature extractor
model = applications.InceptionResNetV2(include_top=False, weights='imagenet',
                                       input_shape=(img_width, img_height, 3))
print("base pretrained model loaded")

validation_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
    path, target_size=(img_width, img_height), batch_size=32, shuffle=False)
print("validation_generator")

# Predicts over all steps in one call and returns every feature map at once
features = model.predict_generator(validation_generator, steps=10)
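One detail worth checking before changing the code: a hard-coded `steps=10` only covers the whole directory if it equals the number of batches the generator yields. A small sketch of that arithmetic, using illustrative counts (a real generator reports them as `generator.samples` and `generator.batch_size`, and `len(generator)` gives the same result):

```python
import math

# Illustrative numbers; with flow_from_directory these come from
# generator.samples and generator.batch_size.
num_samples = 320
batch_size = 32

# One step per batch, rounding up so the final partial batch is included.
steps = math.ceil(num_samples / batch_size)
print(steps)  # 10
```

If `steps` is smaller than this, some images are silently never predicted on.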

1 Answer:

Answer 0 (score: 1)

I loop over the generator object and accumulate the results in lists, so only one batch is in memory at a time, which eliminates the memory problem.

    validation_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
        path, target_size=(img_width, img_height), batch_size=32, shuffle=False)

    prediction_proba1 = []
    prediction_classes1 = []
    print("validation_generator")
    print(len(validation_generator))  # number of batches

    for i in range(len(validation_generator)):
        print(" array coming...")
        kl = validation_generator[i]  # one (images, labels) batch tuple
        print(kl)
        print("numpy array")
        print(kl[0])
        features = model.predict_on_batch(kl[0])  # features for this batch only
        print("features")
        print(features)
        # model1 is the trained top classifier (loading it is not shown here)
        prediction_proba = model1.predict_proba(features)
        prediction_classes = model1.predict_classes(features)
        prediction_classes1.extend(prediction_classes)
        prediction_proba1.extend(prediction_proba)
        print(prediction_classes1)
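The pattern in the answer, predict one batch, append, repeat, can be sketched independently of Keras. Here `dummy_predict` is a hypothetical stand-in for `model.predict_on_batch`; everything else is plain NumPy:

```python
import numpy as np

def predict_in_batches(data, predict_batch, batch_size=32):
    """Run predictions one batch at a time so only a single batch
    is ever resident in memory, then concatenate the results."""
    results = []
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]  # one small slice of the data
        results.append(predict_batch(batch))    # predict on that slice only
    return np.concatenate(results, axis=0)

# Toy stand-in for model.predict_on_batch: doubles each value.
dummy_predict = lambda x: x * 2

data = np.arange(100, dtype=np.float32).reshape(100, 1)
preds = predict_in_batches(data, dummy_predict, batch_size=32)
print(preds.shape)  # (100, 1)
```

Peak memory is then bounded by one batch plus the accumulated outputs, rather than by the full input set, which is what makes the loop in the answer scale to large datasets.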