I am using the following generators:
datagen = ImageDataGenerator(
fill_mode='nearest',
cval=0,
rescale=1. / 255,
rotation_range=90,
width_shift_range=0.1,
height_shift_range=0.1,
zoom_range=0.5,
horizontal_flip=True,
vertical_flip=True,
validation_split = 0.5,
)
train_generator = datagen.flow_from_dataframe(
dataframe=traindf,
directory=train_path,
x_col="id",
y_col=classes,
subset="training",
batch_size=8,
seed=123,
shuffle=True,
class_mode="raw",
target_size=(64,64))
STEP_SIZE_TRAIN = train_generator.n // train_generator.batch_size
valid_generator = datagen.flow_from_dataframe(
dataframe=traindf,
directory=train_path,
x_col="id",
y_col=classes,
subset="validation",
batch_size=8,
seed=123,
shuffle=True,
class_mode="raw",
target_size=(64, 64))
STEP_SIZE_VALID = valid_generator.n // valid_generator.batch_size
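The problem is that the validation data is being augmented as well, which I assume is not what you want during training. How can I avoid this? I don't have two separate directories for training and validation; I want to train the network from a single dataframe. Any suggestions?
Answer 0 (score: 1)
The solution my friend found was to use a second generator for validation, with the same validation split and with shuffling disabled.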
datagen = ImageDataGenerator(
#featurewise_center=True,
#featurewise_std_normalization=True,
rescale=1. / 255,
rotation_range=90,
width_shift_range=0.1,
height_shift_range=0.1,
zoom_range=0.5,
horizontal_flip=True,
vertical_flip=True,
validation_split = 0.15,
)
valid_datagen = ImageDataGenerator(rescale=1. / 255, validation_split=0.15)
The two generators can then be defined as:
train_generator = datagen.flow_from_dataframe(
dataframe=traindf,
directory=train_path,
x_col="id",
y_col=classes,
subset="training",
batch_size=64,
seed=123,
shuffle=False,
class_mode="raw",
target_size=(224,224))
STEP_SIZE_TRAIN = train_generator.n // train_generator.batch_size
valid_generator = valid_datagen.flow_from_dataframe(
dataframe=traindf,
directory=train_path,
x_col="id",
y_col=classes,
subset="validation",
batch_size=64,
seed=123,
shuffle=False,
class_mode="raw",
target_size=(224, 224))
STEP_SIZE_VALID = valid_generator.n // valid_generator.batch_size
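One way to gain confidence in this setup is to check that the two subsets really are disjoint. The following is a minimal sketch and not part of the original answer: `check_disjoint` is a hypothetical helper, and it assumes each generator exposes the list of files it reads (the iterators returned by `flow_from_dataframe` have a `filenames` attribute in the Keras versions I am aware of, but treat that as an assumption).

```python
def check_disjoint(train_files, valid_files):
    """Return the sorted list of files present in both subsets (empty if the split is clean)."""
    overlap = set(train_files) & set(valid_files)
    return sorted(overlap)


# Stand-in lists; in practice these would come from the generators, e.g.
# train_generator.filenames and valid_generator.filenames (assumed attribute).
train_files = ["img_003.png", "img_004.png", "img_005.png"]
valid_files = ["img_001.png", "img_002.png"]

leaked = check_disjoint(train_files, valid_files)
if leaked:
    raise ValueError(f"{len(leaked)} files appear in both subsets: {leaked[:5]}")
```

If the check raises, the two `ImageDataGenerator` objects are probably not using the same `validation_split` value or the same dataframe.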
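Answer 1 (score: 0)
A small change is enough to solve this. Add a second ImageDataGenerator object, called test_datagen, to which you pass only the rescale argument and no augmentation. That way the augmentation lives in a different object from the one that feeds your validation data. You also have to split your data into training and testing directories before passing them to the training and testing data generators. Here is sample code for TensorFlow; you can also refer to this.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# For training data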
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
# For testing data
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
'data/train',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
'data/validation',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
model.fit_generator(
train_generator,
steps_per_epoch=2000,
epochs=50,
validation_data=validation_generator,
validation_steps=800)
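When there are no separate directories, as in the question, the same two-generator pattern can be fed from two dataframes instead. The sketch below only shows the splitting step, under stated assumptions: `split_ids` is a hypothetical helper, the stand-in ids are made up, and the 0.15 fraction and seed 123 are borrowed from the other answer rather than prescribed by any library.

```python
import random


def split_ids(ids, valid_fraction=0.15, seed=123):
    """Shuffle ids reproducibly and return (train_ids, valid_ids)."""
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be between 0 and 1")
    shuffled = list(ids)
    # A private Random instance keeps the split independent of global random state.
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


# Stand-in ids; in practice something like traindf["id"].tolist().
ids = [f"img_{i:03d}.png" for i in range(20)]
train_ids, valid_ids = split_ids(ids)
```

The two id lists can then select rows, for example `traindf[traindf["id"].isin(train_ids)]`, and each resulting dataframe goes to its own generator via `flow_from_dataframe`, with no `validation_split` or `subset` needed.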
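Answer 2 (score: 0)
You should look at the answers to this related question: When using Data augmentation is it ok to validate only with the original images?
It says to use ImageDataGenerator with empty arguments when loading the validation data, for example: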
train_gen = ImageDataGenerator(**aug_params).flow_from_directory(train_dir)
valid_gen = ImageDataGenerator().flow_from_directory(valid_dir)
model.fit_generator(train_gen, validation_data=valid_gen)
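One caveat: a generator built with no arguments also drops `rescale`, so if the training images are rescaled, the validation images should be rescaled the same way. A possible way to keep the two in sync is to derive the validation arguments from the training ones. This is only a sketch: `AUGMENTATION_KEYS` and `validation_params` are hypothetical names, and the key set covers just the random-transform arguments that appear in this thread.

```python
# Random-transform arguments seen in this thread; not an exhaustive list.
AUGMENTATION_KEYS = {
    "rotation_range", "width_shift_range", "height_shift_range",
    "shear_range", "zoom_range", "horizontal_flip", "vertical_flip",
}


def validation_params(aug_params):
    """Copy aug_params without the random-augmentation arguments."""
    return {k: v for k, v in aug_params.items() if k not in AUGMENTATION_KEYS}


aug_params = {
    "rescale": 1.0 / 255,
    "rotation_range": 90,
    "zoom_range": 0.5,
    "horizontal_flip": True,
    "validation_split": 0.15,
}
valid_params = validation_params(aug_params)
# valid_params keeps only "rescale" and "validation_split"
```

The two dictionaries would then be unpacked into the constructors, as in `ImageDataGenerator(**aug_params)` for training and `ImageDataGenerator(**valid_params)` for validation.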