I am training an image classification model with Keras on a dataset of 260K images. Previously, I stored the processed image data in pickle files and wrote a generator to load it into the model. I then found that reading data from HDF5 files is about twice as fast as reading from pickle files, so I stored the data in HDF5 files and changed the generator so that it reads and loads data from the HDF5 files instead of the pickle files. But training became much slower than before. Please, can someone tell me why, and help me fix it?
Env: Keras 2.2.4, h5py 2.8.0, pickleshare 0.7.8, numpy 1.15.4. I have timed the loading manually. In both cases the generator's output type is the same (numpy.ndarray, dtype=np.float64), and I confirmed that reading from HDF5 is twice as fast as reading from pickle.
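To make the read-speed comparison reproducible, here is a minimal standalone sketch of how I timed the two formats. The file names, array shape, and batch size are made up for illustration; my real batches are much larger:

```python
import os
import pickle as pk
import tempfile
import time

import h5py
import numpy as np

tmp = tempfile.mkdtemp()
h5_path = os.path.join(tmp, 'batch_0.h5')
pk_path = os.path.join(tmp, 'batch_0.pk')

# Synthetic stand-in for one batch; shape and size are illustrative only
x = np.random.rand(8, 64, 64, 3)
y = np.eye(10)[np.random.randint(0, 10, size=8)]

# Write the same batch in both formats
with h5py.File(h5_path, 'w') as f:
    f.create_dataset('x', data=x)
    f.create_dataset('y', data=y)
with open(pk_path, 'bw') as f:
    pk.dump({'x': x, 'y': y}, f)

# Time one full read from each format
t0 = time.time()
with h5py.File(h5_path, 'r') as f:
    x_h5, y_h5 = f['x'][()], f['y'][()]
t_h5 = time.time() - t0

t0 = time.time()
with open(pk_path, 'br') as f:
    d = pk.load(f)
x_pk, y_pk = d['x'], d['y']
t_pk = time.time() - t0

print('h5py read: %.4fs, pickle read: %.4fs' % (t_h5, t_pk))
```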
import h5py
from glob import glob
from keras.optimizers import SGD
from keras.utils import multi_gpu_model
# ResnetBuilder, f1, checkpoint, IMG_DIR, n_classes, NUM_EPOCH,
# STEPS_PER_EPOCH are defined elsewhere in my project

# Before, the parameters were x and y corresponding to the pickle files
def img_generator(batches):
    while True:
        for batch in batches:
            data_batch = h5py.File(batch, 'r')
            img_batch = data_batch['x'][()]
            label_batch = data_batch['y'][()]
            # img_batch = pk.load(open(x, 'br'))
            # label_batch = pk.load(open(y, 'br'))
            yield img_batch, label_batch

# Read data file paths
batch_fps = glob(IMG_DIR + 'batch_*')
batches_h5 = sorted(batch_fps, key=lambda x: int(x.split('_')[-1][:-3]))

# Build model and train
resnet_50 = ResnetBuilder.build_resnet_50((512, 512, 3), n_classes)
parallel_model = multi_gpu_model(resnet_50, GPU_COUNT)  # GPU_COUNT = 4
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer=SGD(lr=0.01, momentum=0.9, decay=0, nesterov=True),
                       metrics=['accuracy', f1])
parallel_model.fit_generator(generator=img_generator(batches_h5),
                             epochs=NUM_EPOCH, steps_per_epoch=STEPS_PER_EPOCH,
                             callbacks=[checkpoint])
I expected the training time to drop by about 30 minutes, but instead it increased by 2 hours (it was 3.5 hours before).
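For what it's worth, one structural difference I notice versus the pickle version is that my generator never closes the h5py.File handles, so they accumulate across batches. A variant that closes each file via a context manager looks like this; whether this is actually the cause of the slowdown is exactly what I am unsure about:

```python
import h5py

def img_generator(batches):
    """Yield (img_batch, label_batch) forever, closing each HDF5 file after use."""
    while True:
        for batch in batches:
            # The context manager closes the file handle once the batch is read
            with h5py.File(batch, 'r') as data_batch:
                img_batch = data_batch['x'][()]
                label_batch = data_batch['y'][()]
            yield img_batch, label_batch
```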