Is it possible to get the names of the files that were loaded using flow_from_directory?
I have:
datagen = ImageDataGenerator(
    rotation_range=3,
    # featurewise_std_normalization=True,
    fill_mode='nearest',
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)
train_generator = datagen.flow_from_directory(
    path+'/train',
    target_size=(224, 224),
    batch_size=batch_size,)
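I have a custom generator for my multi-output model, like this:
a = np.arange(8).reshape(2, 4)
# print(a)
print(train_generator.filenames)
def generate():
    while 1:
        # next(...) works on both Python 2 and 3; .next() is Python 2 only
        x, y = next(train_generator)
        yield [x], [a, y]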
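Note: at the moment I am generating random numbers for a, but for real training I want to load a JSON file that contains the bounding box coordinates for my images. For that, I need the names of the files that were produced by the train_generator.next() call. Once I have them, I can load the file, parse the JSON and pass it instead of a. The ordering of the x variable and of the list of file names I get must also be the same.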
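For the JSON step described above, a minimal sketch might look like the following. It assumes a hypothetical annotation layout (not given in the question) in which one JSON object maps each image file name to an [x_min, y_min, x_max, y_max] list. The function name boxes_for_batch and that layout are illustrative only.

```python
import json


def boxes_for_batch(filenames, annotations):
    """Return one bounding box per file name, in the same order as filenames.

    `annotations` is assumed to be a dict mapping
    file name -> [x_min, y_min, x_max, y_max] (a hypothetical layout;
    adapt it to the real JSON file).
    """
    missing = [name for name in filenames if name not in annotations]
    if missing:
        raise KeyError(f"no bounding box for: {missing}")
    return [annotations[name] for name in filenames]


# Stand-in for json.load(...) on the real annotation file.
annotations = json.loads(
    '{"cats/1.jpg": [0, 0, 10, 10], "dogs/7.jpg": [5, 5, 20, 30]}'
)
batch_boxes = boxes_for_batch(["dogs/7.jpg", "cats/1.jpg"], annotations)
```

Because the boxes are looked up by name, their order follows whatever order the file names arrive in, which is the alignment with x that the question asks for.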
Answer 0 (score: 21)
Yes, it is possible, at least with version 2.0.4 (I don't know about earlier versions).
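The instance returned by ImageDataGenerator().flow_from_directory(...) has a filenames attribute, which is a list of all the files in the order the generator yields them, and also a batch_index attribute. So you can do this:
datagen = ImageDataGenerator()
gen = datagen.flow_from_directory(...)
On each iteration over the generator, you can then use gen.batch_index and gen.batch_size to slice gen.filenames and get the corresponding file names.
This gives you the file names of the images in the current batch.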
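The slicing step can be sketched as below. This is a hedged illustration, not code taken from the answer: it assumes shuffle=False (so that filenames is in yield order) and that batch_index has already been advanced by the time a batch is returned, wrapping to 0 after the last batch of an epoch. The helper name current_batch_filenames is hypothetical.

```python
def current_batch_filenames(filenames, batch_index, batch_size):
    """File names of the batch that was just produced.

    batch_index is the generator's value read *after* next(); 0 is taken
    to mean the epoch just wrapped, so the batch was the last one.
    """
    n = len(filenames)
    if batch_index == 0:
        # Last batch of the epoch; it may be shorter than batch_size.
        last_size = n % batch_size or batch_size
        return filenames[n - last_size:]
    start = (batch_index - 1) * batch_size
    return filenames[start:start + batch_size]


names = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
first = current_batch_filenames(names, batch_index=1, batch_size=2)
last = current_batch_filenames(names, batch_index=0, batch_size=2)
```

With a real generator this would presumably be called as current_batch_filenames(gen.filenames, gen.batch_index, gen.batch_size) right after each batch is drawn.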
Answer 1 (score: 2)
Here is an example that also works with shuffle=True and handles the last batch correctly. To make one pass over the data:
datagen = ImageDataGenerator().flow_from_directory(...)
batches_per_epoch = datagen.samples // datagen.batch_size + (datagen.samples % datagen.batch_size > 0)
for i in range(batches_per_epoch):
    batch = next(datagen)
    current_index = ((datagen.batch_index-1) * datagen.batch_size)
    if current_index < 0:
        if datagen.samples % datagen.batch_size > 0:
            current_index = max(0, datagen.samples - datagen.samples % datagen.batch_size)
        else:
            current_index = max(0, datagen.samples - datagen.batch_size)
    index_array = datagen.index_array[current_index:current_index + datagen.batch_size].tolist()
    img_paths = [datagen.filepaths[idx] for idx in index_array]
    # batch[0] - x, batch[1] - y, img_paths - absolute path
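The index arithmetic in that loop can be checked in isolation. The sketch below uses only plain integers and no Keras; batch_ranges is a hypothetical helper that lists the (start, stop) slice of the index array covered by each batch in one epoch.

```python
import math


def batch_ranges(samples, batch_size):
    """(start, stop) positions in the index array for every batch of one epoch."""
    # Same count as samples // batch_size + (samples % batch_size > 0).
    batches_per_epoch = math.ceil(samples / batch_size)
    return [
        (i * batch_size, min((i + 1) * batch_size, samples))
        for i in range(batches_per_epoch)
    ]


ranges = batch_ranges(samples=10, batch_size=4)
```

When the division leaves a remainder, the last range starts at samples - samples % batch_size, which is the value the loop falls back to once batch_index has wrapped around to 0.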
Answer 2 (score: 1)
You can make a very small subclass that returns (image, file_path) tuples by inheriting from DirectoryIterator:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator, DirectoryIterator
class ImageWithNames(DirectoryIterator):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.filenames_np = np.array(self.filepaths)
        self.class_mode = None  # so that we only get the images back
    def _get_batches_of_transformed_samples(self, index_array):
        return (super()._get_batches_of_transformed_samples(index_array),
                self.filenames_np[index_array])
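In the init, I added an attribute that is the numpy version of self.filepaths, so that we can easily index into that array to get the paths for each generated batch.
The only other change to the base class is to return a tuple made of the image batch, super()._get_batches_of_transformed_samples(index_array), and the file paths, self.filenames_np[index_array].
With that, you can create your generator like so: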
imagegen = ImageDataGenerator()
datagen = ImageWithNames('/data/path', imagegen, target_size=(224,224))
And then check it with
next(datagen)
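To see the override pattern without Keras installed, here is a toy sketch. TinyIterator is a hypothetical stand-in, not the real DirectoryIterator, and its method names are invented for the example. Only the idea of wrapping the batch-building method so that it returns a (batch, paths) tuple carries over.

```python
class TinyIterator:
    """Hypothetical stand-in for a directory iterator; yields fake 'images'."""

    def __init__(self, filepaths, batch_size):
        self.filepaths = list(filepaths)
        self.batch_size = batch_size
        self._start = 0

    def _get_batch(self, index_array):
        # A real iterator would load and augment image files here.
        return [f"pixels:{self.filepaths[i]}" for i in index_array]

    def __iter__(self):
        return self

    def __next__(self):
        if self._start >= len(self.filepaths):
            self._start = 0  # begin a new epoch
        stop = min(self._start + self.batch_size, len(self.filepaths))
        index_array = list(range(self._start, stop))
        self._start = stop
        return self._get_batch(index_array)


class TinyIteratorWithNames(TinyIterator):
    def _get_batch(self, index_array):
        # Same batch as the base class, paired with the paths it came from.
        images = super()._get_batch(index_array)
        return images, [self.filepaths[i] for i in index_array]


it = TinyIteratorWithNames(["a.jpg", "b.jpg", "c.jpg"], batch_size=2)
images, paths = next(it)
```

Because the paths are built from the same index_array as the images, the pairing holds regardless of the order in which indices are drawn.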
Answer 3 (score: 1)
At least in version 2.2.4, you can do it like this:
datagen = ImageDataGenerator()
gen = datagen.flow_from_directory(...)
for file in gen.filenames:
    print(file)
or, to get the file paths:
for filepath in gen.filepaths:
    print(filepath)
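Once the file paths are available, locating a matching annotation file is a small path manipulation. The sketch below assumes, purely for illustration, that each image has a JSON file with the same base name next to it. annotation_path_for is a hypothetical helper, and PurePosixPath is used only to keep the example platform-independent (pathlib.Path would be the usual choice on a real file system).

```python
from pathlib import PurePosixPath


def annotation_path_for(image_path):
    """Swap the image extension for .json (an assumed naming convention)."""
    return str(PurePosixPath(image_path).with_suffix(".json"))


json_path = annotation_path_for("/data/train/cats/001.jpg")
```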
Answer 4 (score: 1)
The following code may help. It overrides flow_from_directory:
class AugmentingDataGenerator(ImageDataGenerator):
    def flow_from_directory(self, directory, *args, **kwargs):
        # class_mode=None makes the base generator yield image batches only
        generator = super().flow_from_directory(directory, class_mode=None, *args, **kwargs)
        paths = generator.filepaths
        while True:
            # Walk the file list one batch at a time (requires shuffle=False)
            for start in range(0, len(paths), generator.batch_size):
                # Get augmented image samples
                images = next(generator)
                # print(paths[start:start + generator.batch_size])
                yield images, paths[start:start + generator.batch_size]
# Create training generator
train_datagen = AugmentingDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rescale=1./255,
    horizontal_flip=True
)
train_generator = train_datagen.flow_from_directory(
    TRAIN_DIRECTORY_PATH,
    target_size=(256, 256),
    shuffle=False,
    batch_size=BATCH_SIZE
)
# Create testing generator
test_datagen = AugmentingDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    TEST_DIRECTORY_PATH,
    target_size=(256, 256),
    shuffle=False,  # in order to get the image path of the same image
    batch_size=BATCH_SIZE
)
Then check the returned images and file paths:
images, file_paths = next(test_generator)
# print(file_paths)
# plt.imshow(images[0])
Answer 5 (score: 0)
I needed exactly this, and I wrote a simple function that works with either shuffle=True or shuffle=False.
def get_indices_from_keras_generator(gen, batch_size):
    """
    Given a keras data generator, it returns the indices and the filepaths
    corresponding to the current batch.
    :param gen: keras generator.
    :param batch_size: size of the last batch generated.
    :return: tuple with indices and filenames
    """
    idx_left = (gen.batch_index - 1) * batch_size
    idx_right = idx_left + gen.batch_size if idx_left >= 0 else None
    indices = gen.index_array[idx_left:idx_right]
    filenames = [gen.filenames[i] for i in indices]
    return indices, filenames
Then you would use it as follows, passing the size of the batch that was just generated:
for x, y in gen:
    indices, filenames = get_indices_from_keras_generator(gen, len(x))
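Putting the pieces together for the multi-output case in the question, a wrapper generator could look roughly like this. It is a sketch under assumptions: gen is expected to expose batch_index, batch_size, index_array and filenames in the way the function above relies on, and annotations is a hypothetical dict from file name to bounding box.

```python
def generate_with_boxes(gen, annotations):
    """Yield ([x], [boxes, y]) with boxes aligned to the images in x."""
    while True:
        x, y = next(gen)
        # batch_index already points at the next batch; a negative start
        # means the epoch wrapped and x is the final (possibly short) batch.
        start = (gen.batch_index - 1) * gen.batch_size
        if start >= 0:
            indices = gen.index_array[start:start + gen.batch_size]
        else:
            indices = gen.index_array[-len(x):]
        boxes = [annotations[gen.filenames[i]] for i in indices]
        # For real training, boxes would typically be converted to an array.
        yield [x], [boxes, y]
```

Reading the indices from index_array rather than slicing filenames directly is what keeps the boxes aligned with x when the generator shuffles.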