I am running data augmentation with Keras for image classification, like this:
# Assumes the TensorFlow-bundled Keras; use keras.preprocessing.image otherwise
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define Parameters
parameters = {"img_width": 224,
              "img_height": 224,
              "epochs": 50,
              "batch_size": 15}

# Define Generators
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.06)
test_datagen = ImageDataGenerator(
    rescale=1. / 255)

# Define Flows from directories
train_generator = train_datagen.flow_from_directory(
    directory=train_data_dir,
    target_size=(parameters["img_width"], parameters["img_height"]),
    batch_size=parameters["batch_size"],
    class_mode="categorical",
    subset="training",
    color_mode="rgb",
    seed=42)
validation_generator = train_datagen.flow_from_directory(
    directory=train_data_dir,
    target_size=(parameters["img_width"], parameters["img_height"]),
    batch_size=parameters["batch_size"],
    class_mode="categorical",
    subset="validation",
    color_mode="rgb",
    seed=42)
testing_generator = test_datagen.flow_from_dataframe(
    dataframe=testing_df,
    x_col="path", y_col="label",
    class_mode="raw",
    target_size=(parameters["img_width"], parameters["img_height"]),
    shuffle=False,
    batch_size=parameters["batch_size"])
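This code prints the following for training, validation, and testing: Found 4911 images belonging to 69 classes. Found 282 images belonging to 69 classes. Found 421 validated image filenames.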
However, if I want to use test_datagen instead of train_datagen for the validation data, as follows:
validation_generator = test_datagen.flow_from_directory(  # Changing Line
    directory=train_data_dir,
    target_size=(parameters["img_width"], parameters["img_height"]),
    batch_size=parameters["batch_size"],
    class_mode="categorical",
    subset="validation",
    color_mode="rgb",
    seed=42)
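The output I get is: Found 0 images belonging to 69 classes.
How can I fix this? In short, I want to validate on images that look like the ones the model will see in real use, which is why I want test_datagen, which only rescales the pixel values.
P.S. train_data_dir is a folder containing 69 subfolders, each holding the images of one class.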
Answer 0 (score: 0)
I don't think you should validate and train from the same directory.
Try pointing to a dedicated validation directory, for example:
validation_generator = test_datagen.flow_from_directory(  # Changing Line
    directory=validation_data_dir,
    target_size=(parameters["img_width"], parameters["img_height"]),
    batch_size=parameters["batch_size"],
    class_mode="categorical",
    # no subset here: test_datagen has no validation_split, so
    # subset="validation" would again select 0 images
    color_mode="rgb",
    seed=42)
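As for why the question's version finds 0 images: as far as I understand, Keras picks a subset by slicing each class's sorted file list at validation_split, and test_datagen leaves that at its default of 0. The sketch below is my own simplified model of that slicing, not the library's code; `subset_bounds` is a hypothetical helper name.

```python
def subset_bounds(n_files, validation_split, subset):
    """Index range [start, stop) of one class's sorted file list kept for a subset.

    Simplified model of the subset selection, written for illustration only.
    """
    if subset == "validation":
        lo, hi = 0.0, validation_split
    elif subset == "training":
        lo, hi = validation_split, 1.0
    else:
        raise ValueError("subset must be 'training' or 'validation'")
    return int(lo * n_files), int(hi * n_files)


# test_datagen in the question has no validation_split (default 0.0),
# so the validation slice is empty
print(subset_bounds(100, 0.0, "validation"))   # (0, 0)

# with the same fraction as train_datagen the slices are complementary
print(subset_bounds(100, 0.06, "validation"))  # (0, 6)
print(subset_bounds(100, 0.06, "training"))    # (6, 100)
```

If this model holds for your Keras version, another option is to give test_datagen the same validation_split=0.06 as train_datagen and keep subset="validation". It should then pick the files that train_datagen holds out, without moving anything on disk.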
The directory layout should look like this:
train/
    69 folders
validation/
    69 folders
test/
    69 folders
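One way to get from a single folder of class subfolders to that layout is to copy a fraction of each class into a separate validation tree. This is a minimal standard-library sketch; `split_class_folders` and its parameters are names I made up for illustration, and it assumes the target directories lie outside the source directory.

```python
import random
import shutil
from pathlib import Path


def split_class_folders(source_dir, train_dir, validation_dir,
                        validation_fraction=0.06, seed=42):
    """Copy every class folder's files into train/ and validation/ trees.

    Returns {class_name: (n_train, n_validation)}.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    class_dirs = sorted(p for p in Path(source_dir).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        files = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(files)
        n_val = int(len(files) * validation_fraction)
        for i, src in enumerate(files):
            # the first n_val shuffled files go to validation, the rest to train
            root = validation_dir if i < n_val else train_dir
            dest = Path(root) / class_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest / src.name)
        counts[class_dir.name] = (len(files) - n_val, n_val)
    return counts
```

Called as split_class_folders(train_data_dir, "data/train", "data/validation") (placeholder paths), it leaves the original folder untouched, so you can delete the copies and re-split with another fraction.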
For example, the setup I usually use is:
# cwd, img_width, img_height, batch_size, epochs, nb_train_samples,
# nb_validation_samples and model are defined elsewhere in the script
train_data_dir = (str(cwd) + r'\augmented\train\\')
validation_data_dir = (str(cwd) + r'\augmented\validation\\')

# augmentation only for training data
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
# validation data is only rescaled
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

# fit_generator is deprecated since TensorFlow 2.1; model.fit takes
# the same generator arguments there
history = model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
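To write augmented images into separate directories you can do the following. Note that this gets tedious, so I suggest building a loop over your list of classes. My example was only a binary classification (1 or 0): I took one "original" class-0 image, augmented it into the train, validation, and test folders, and then ran the script again for a class-1 image. You have many more classes, so a loop over the class list is the way to go.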
from tensorflow.keras.preprocessing.image import (
    ImageDataGenerator, img_to_array, load_img)

# rescaling is disabled to allow the images to be viewed
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# this is a PIL image # path + filename
img = load_img(r'path_to_single_image_to_be_augmented')
# this is a Numpy array with shape (height, width, 3) under the
# default channels_last data format
x = img_to_array(img)
# this is a Numpy array with shape (1, height, width, 3)
x = x.reshape((1,) + x.shape)

# the .flow() command below generates batches of randomly transformed
# images and saves the results to save_to_dir - remember to change prefix
i = 0
for batch in datagen.flow(x, batch_size=1,
                          save_to_dir=(str(cwd) + r'\augmented\test\0'),
                          save_prefix='0', save_format='jpeg'):
    i += 1
    if i > 110:  # change the amount of augmented data you want here
        break  # otherwise the generator would loop indefinitely

i = 0
for batch in datagen.flow(x, batch_size=1,
                          save_to_dir=(str(cwd) + r'\augmented\train\0'),
                          save_prefix='0', save_format='jpeg'):
    i += 1
    if i > 280:
        break

i = 0
for batch in datagen.flow(x, batch_size=1,
                          save_to_dir=(str(cwd) + r'\augmented\validation\0'),
                          save_prefix='0', save_format='jpeg'):
    i += 1
    if i > 280:
        break
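The loop over classes suggested above could look roughly like this. It is a sketch, not tested against a real dataset: `augment_to_dir` and `augment_all` are hypothetical helper names, `datagen` is assumed to be an ImageDataGenerator like the one above (anything with a compatible `.flow()` works), and each `x` is an array of shape (1, height, width, 3). It uses itertools.islice instead of a manual counter to stop the endless generator.

```python
from itertools import islice
from pathlib import Path


def augment_to_dir(datagen, x, out_dir, prefix, n_images):
    """Save n_images augmented copies of the single-image batch x into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the target folder up front
    flow = datagen.flow(x, batch_size=1, save_to_dir=str(out_dir),
                        save_prefix=prefix, save_format="jpeg")
    # flow never ends on its own; islice stops it after n_images batches
    return sum(1 for _ in islice(flow, n_images))


def augment_all(datagen, arrays_by_class, out_root, n_per_split):
    """Run augment_to_dir for every class and every split.

    arrays_by_class: {class_name: x}
    n_per_split:     {split_name: number of images to generate}
    """
    for class_name, x in arrays_by_class.items():
        for split, n_images in n_per_split.items():
            augment_to_dir(datagen, x, Path(out_root) / split / class_name,
                           prefix=class_name, n_images=n_images)
```

For the binary example you would pass something like arrays_by_class={"0": x0, "1": x1} and n_per_split={"train": 280, "validation": 280, "test": 110}; these counts are only illustrative. Keep in mind that augmenting one source image into all three splits makes validation and test images close relatives of the training images.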