Dealing with loading a large amount of data in Keras

Asked: 2017-09-06 08:23:46

Tags: python memory keras gpu

I have this Keras code, which uses some of the pre-trained models available in Keras to predict the classes of some image data.

For example, the function can be called like this:

pre_features(Xception, lf_path, uf_path, batch_size, models_path, (299, 299), xception.preprocess_input)  

This means it will use the Xception Keras model to predict features for the files in lf_path and uf_path, with the images resized to (299, 299, 3).

The code for the function is:

import h5py
from os.path import join

from keras.layers import Input, Lambda, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator


def pre_features(MODEL, lf_path, uf_path, batch_size, models_path, image_size=(224, 224), lambda_func=None):
    width = image_size[0]
    height = image_size[1]
    input_tensor = Input((height, width, 3))
    x = input_tensor
    if lambda_func:
        # apply the model-specific preprocessing (e.g. xception.preprocess_input)
        x = Lambda(lambda_func)(x)
    # pre-trained base model without its classification head, pooled to one feature vector per image
    base_model = MODEL(input_tensor=x, weights='imagenet', include_top=False)
    model = Model(base_model.input, GlobalAveragePooling2D()(base_model.output))

    # generators that stream the images from disk in batches
    gen = ImageDataGenerator()
    train_generator = gen.flow_from_directory(join(lf_path, 'train_set'), image_size, shuffle=False,
                                              batch_size=batch_size)
    valid_generator = gen.flow_from_directory(join(lf_path, 'valid_set'), image_size, shuffle=False,
                                              batch_size=batch_size)
    unlabeled_generator = gen.flow_from_directory(uf_path, image_size, shuffle=False,
                                                  batch_size=batch_size, class_mode=None)

    # predict the pooled features for each set
    train = model.predict_generator(train_generator, train_generator.samples)
    valid = model.predict_generator(valid_generator, valid_generator.samples)
    unlabeled = model.predict_generator(unlabeled_generator, unlabeled_generator.samples)

    # save the features and labels to an HDF5 file
    # (note: typ is not defined in this snippet; it is assumed to come from the surrounding scope)
    with h5py.File(join(models_path, typ + "_gap_%s.h5" % MODEL.__name__)) as h:
        h.create_dataset("train", data=train)
        h.create_dataset("valid", data=valid)
        h.create_dataset("unlabeled", data=unlabeled)
        h.create_dataset("label", data=train_generator.classes)
        h.create_dataset("val_label", data=valid_generator.classes)

So right now I use an ImageDataGenerator to load the data from disk in batches, and then I predict the classes.

The problem I'm running into is that I run out of memory on the GPU of the grid where this code runs, and I get a bunch of errors like these:

2017-09-06 01:05:01.216563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:03:00.0)
2017-09-06 01:05:03.984894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:03:00.0)
2017-09-06 02:36:05.106652: E tensorflow/stream_executor/cuda/cuda_driver.cc:955] failed to alloc 4294967296 bytes on host: CUDA_ERROR_INVALID_VALUE
2017-09-06 02:36:05.135916: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 4294967296
2017-09-06 02:36:06.234000: E tensorflow/stream_executor/cuda/cuda_driver.cc:955] failed to alloc 3865470464 bytes on host: CUDA_ERROR_INVALID_VALUE

and in the end I get:

Job killed after exceeding memory limits

I suspect the problem occurs when I try to predict the output for the unlabeled generator, because the unlabeled data is larger than the other sets (about 31,000 images).
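For context, here is a rough back-of-the-envelope estimate of how large the returned feature array gets for that set (a sketch only: it assumes Xception's global-average-pooled output is 2048-dimensional and stored as float32, and how the second argument of predict_generator is interpreted depends on the Keras version in use):

# rough size estimate for the array returned by model.predict_generator
# on the unlabeled set (assumption: Xception + GlobalAveragePooling2D
# yields 2048 float32 features per image)
n_images = 31000        # approximate size of the unlabeled set
n_features = 2048       # Xception GAP output dimension (assumption)
bytes_per_float32 = 4

size_mb = n_images * n_features * bytes_per_float32 / 1024.0 / 1024.0
print("features for the unlabeled set: ~%.0f MB" % size_mb)   # ~242 MB

# caveat: in the Keras 2 API, predict_generator(generator, steps) treats the
# second argument as the number of *batches*, not samples, so passing
# generator.samples there makes the model run batch_size times more
# predictions and grows the array (and host memory use) by that factor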

What is the best way to do this prediction without running into the memory problem? Any suggestions?

0 Answers:

No answers yet.