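I'm trying to run K-fold cross-validation on a Keras model (using ImageDataGenerator and flow_from_directory for the training and validation data), and I have a question about the validation_split parameter of ImageDataGenerator.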
test_datagen = ImageDataGenerator(
    rescale = 1. / 255,
    rotation_range = 180,
    width_shift_range = 0.2,
    height_shift_range = 0.2,
    brightness_range = (0.8, 1.2),
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True,
    vertical_flip = True,
    validation_split = 0.1
)
train_datagen = ImageDataGenerator(
    rotation_range = 180,
    width_shift_range = 0.2,
    height_shift_range = 0.2,
    brightness_range = (0.8, 1.2),
    rescale = 1. / 255,
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True,
    vertical_flip = True,
    validation_split = 0.1
)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size = (img_width, img_height),
    batch_size = batch_size,
    class_mode = 'binary',
    seed = 42
)
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size = (img_width, img_height),
    batch_size = batch_size,
    class_mode = 'binary',
    seed = 42
)
history = model.fit_generator(
    train_generator,
    steps_per_epoch = nb_train_samples // batch_size,
    epochs = epochs,
    validation_data = validation_generator,
    validation_steps = nb_validation_samples // batch_size
)
Does validation_split = 0.1 mean that I have already performed 10-fold cross-validation on my dataset?
Answer 0 (score: 0)
No. It performs validation only once, on a single held-out portion of the data. From the official documentation:
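validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling.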
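So setting validation_split=0.1 only holds out 10% of your data from training and uses it as a single validation set. That is one hold-out split, not 10-fold cross-validation. Note that the passage quoted here describes the validation_split argument of model.fit; with ImageDataGenerator, the fraction is applied only when flow_from_directory is called with subset='training' or subset='validation', which the code in the question does not do.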
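To make the single hold-out behaviour from the quoted passage concrete, here is a plain-Python illustration. It is not Keras's actual implementation; `holdout_split` is a hypothetical helper written only for this example, and the sample list is made up.

```python
def holdout_split(samples, validation_split):
    """Split once: the last `validation_split` fraction becomes validation data.

    Hypothetical helper for illustration only, not part of Keras.
    """
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be strictly between 0 and 1")
    # Index where the training part ends and the validation part begins.
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


# Made-up data: 20 samples identified by their index.
samples = list(range(20))
train_part, val_part = holdout_split(samples, validation_split=0.1)

# Every sample lands in exactly one part, and the split is made only once,
# so most samples are never used for validation.
print(len(train_part), len(val_part))
```

With k-fold cross-validation, by contrast, the data is partitioned k times so that every sample is used for validation exactly once.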
If you want k-fold cross-validation, you have to do it manually. Here is a good starting point: Evaluate the Performance of Deep Learning Models in Keras
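As a rough sketch of what "manually" could look like, the snippet below partitions a list of image paths into k folds using only the standard library. The helper `k_fold_splits` and the file names are assumptions made up for this example; the model building and training inside the loop are left as comments because they depend on your setup.

```python
import random


def k_fold_splits(items, k, seed=42):
    """Yield (train_items, val_items) pairs, one pair per fold.

    Hypothetical helper for illustration; the split is not stratified by class.
    """
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and len(items)")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Deal items round-robin so that fold sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, val


# Made-up file names standing in for the images under train_data_dir.
image_paths = [f"img_{n:03d}.jpg" for n in range(20)]

for fold_index, (train_paths, val_paths) in enumerate(
    k_fold_splits(image_paths, k=10)
):
    # In a real run: build a fresh model here, train it on train_paths,
    # evaluate it on val_paths, and record the score for this fold.
    print(fold_index, len(train_paths), len(val_paths))
```

The per-fold scores are then averaged to get the cross-validated estimate. For a binary classification problem like the one in the question, a stratified split (for example scikit-learn's StratifiedKFold) may be preferable so that each fold keeps a similar class balance.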