Question

I am trying to understand how to use data generators in Keras during training. Given a setup such as

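from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator()
datagen.fit(x_train)
# flow() takes the images and their labels
model.fit_generator(datagen.flow(x_train, y_train, batch_size=32),
                    steps_per_epoch=100,
                    epochs=20)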

how can I find out how much data is "generated", and when? I don't understand the relationship between batch_size and steps_per_epoch.

Is the above equivalent to

for epoch 1 to 20:
    for each img in x_train:
        generate 100 morphed images based on img
        put these into batches of size 32
        fit each batch

Or is it more like this:

for epoch 1 to 20:
    for each img in x_train:
        generate 100 morphed images based on img
    put all of the 100*x_train.shape[0] images into batches of size 32
    fit each batch

So how does this work? Is there a way to investigate/debug this?

Answer 1

It works like this:

for epoch in range(20):
    for step in range(steps_per_epoch):
        yield x, y
        # where x.shape == (32, imgshape1, imgshape2, imgshape3)
        # where y.shape == (32, your_output_shape...)
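To make the relationship between batch_size and steps_per_epoch concrete, here is a rough plain-Python model of that loop. It is only a sketch, not Keras's actual implementation: it assumes the iterator is an endless stream that reshuffles the data on every pass and hands out one batch per step, and the names `batch_generator` and `simulate_training` are made up for illustration.

```python
import random

def batch_generator(n_samples, batch_size, seed=0):
    """Endless stream of index batches, reshuffled on every pass over the data."""
    rng = random.Random(seed)
    while True:
        order = list(range(n_samples))
        rng.shuffle(order)
        for start in range(0, n_samples, batch_size):
            # The last batch of a pass is smaller when n_samples % batch_size != 0.
            yield order[start:start + batch_size]

def simulate_training(n_samples, batch_size, steps_per_epoch, epochs):
    """Return how many samples are consumed in each epoch."""
    gen = batch_generator(n_samples, batch_size)
    seen_per_epoch = []
    for _ in range(epochs):
        seen = 0
        for _ in range(steps_per_epoch):
            batch = next(gen)  # one step pulls exactly one batch
            seen += len(batch)
        seen_per_epoch.append(seen)
    return seen_per_epoch

counts = simulate_training(n_samples=3200, batch_size=32,
                           steps_per_epoch=100, epochs=20)
print(counts[0], sum(counts))  # 3200 64000
```

Under these assumptions an epoch is simply steps_per_epoch batches, so about batch_size * steps_per_epoch = 3200 augmented images per epoch here, regardless of how many images x_train holds. Setting steps_per_epoch to roughly len(x_train) / batch_size makes one epoch correspond to one pass over the data.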

To see exactly how the generator builds its batches (I suspect it goes img1 transformed, img2 transformed, img3 transformed, ..., possibly shuffled), you can:

gen = datagen.flow(x_train, y_train, batch_size=32)

for i in range(2 * your_total_images):
    x, y = next(gen)  # fetches one batch
    useAPlottingLibraryAndPlot(x)  # placeholder for any plotting call; x holds one batch of images
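Another way to investigate is to wrap the generator in a thin logging layer and count what actually gets pulled from it. The sketch below is only an illustration: `logged` is a hypothetical helper, and `fake_flow` is a toy stand-in for `datagen.flow(...)` so that the snippet runs without Keras or real data.

```python
def logged(gen, log):
    """Wrap a batch generator and append each batch's size to `log`."""
    for x, y in gen:
        log.append(len(x))  # len() is the batch dimension for lists and NumPy arrays
        yield x, y

def fake_flow(batch_size=32):
    """Toy stand-in for datagen.flow(...): endless batches of dummy data."""
    while True:
        yield [0.0] * batch_size, [0] * batch_size

batch_sizes = []
gen = logged(fake_flow(), batch_sizes)
for _ in range(100):  # what one epoch with steps_per_epoch=100 would pull
    next(gen)
print(len(batch_sizes), sum(batch_sizes))  # 100 3200
```

In principle the same wrapper could be placed around the real `datagen.flow(...)` before passing it to `fit_generator`, assuming a plain Python generator is acceptable there (it may not play well with multiple workers).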
