I have a large image dataset that I want to split into a grid of patches, for example a 10×10 grid. For each of these patches I want to train a separate autoencoder, so in the 10×10 case I end up with 100 autoencoders.
My solution so far is to create a new ImageDataGenerator for each patch. But this seems far too inefficient: every image is loaded in full 100 times (once per autoencoder), even though each autoencoder only needs a single patch. In theory, loading each image once should be enough. Is there a better way that I'm not seeing? Thanks in advance!
from keras.preprocessing.image import ImageDataGenerator
import numpy as np

def crop_to_patch_function(patch_x: int, patch_y: int, grid_size: int):
    # grid_size is the edge length of one patch in pixels
    def crop_to_patch(img):
        x, y = patch_x * grid_size, patch_y * grid_size
        return img[y:(y + grid_size), x:(x + grid_size), :]
    return crop_to_patch

def patch_generator(patch_x, patch_y, grid_size):
    datagen = ImageDataGenerator(rescale=1/255)
    train_batches_tmp = datagen.flow_from_directory(
        directory=train_data_dir,
        target_size=(img_height, img_width),
        batch_size=batch_size,
        color_mode='rgb',
        class_mode='input',
    )
    # build the crop closure once instead of once per image
    crop = crop_to_patch_function(patch_x, patch_y, grid_size)
    while True:
        batch_x, _ = next(train_batches_tmp)
        batch_patches = np.zeros((batch_x.shape[0], grid_size, grid_size, 3))
        for i in range(batch_x.shape[0]):
            batch_patches[i] = crop(batch_x[i])
        yield (batch_patches, batch_patches)

# batches of the patch at grid position (2, 4)
patch_x, patch_y = 2, 4
train_patch_batches = patch_generator(patch_x, patch_y, grid_size)
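For context, a minimal sketch of how one of these generators is then consumed (build_autoencoder and num_train_images are placeholders, not part of the code above; with older standalone Keras, use fit_generator instead of fit):

    # hypothetical training loop for the autoencoder at patch (2, 4)
    autoencoder = build_autoencoder(input_shape=(grid_size, grid_size, 3))  # placeholder
    autoencoder.fit(
        train_patch_batches,
        steps_per_epoch=num_train_images // batch_size,  # assumed known
        epochs=10,
    )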
Answer 0 (score: 1)
Would it not work to preprocess the images by creating the patches up front? Save them into separate directories and point each ImageDataGenerator at one of the 100 directories to load the data for each model.
Something like this:
import os

def images_to_patches(images_list):
    for idx, image in enumerate(images_list):
        for patch_x in range(10):
            for patch_y in range(10):
                # crop_patch returns the patch as a PIL image
                # (the PIL analogue of crop_to_patch above)
                patch_img = crop_patch(image, patch_x, patch_y, grid_size)
                img_dir = str(patch_x) + str(patch_y)
                os.makedirs(img_dir, exist_ok=True)
                patch_img.save(os.path.join(img_dir, str(idx) + '.png'))
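Each autoencoder then gets its own generator pointed at the matching patch directory. A minimal sketch, assuming patches_root is the base directory holding the 100 folders created above (note that flow_from_directory expects the images to sit inside a subfolder of the directory it is given, even with class_mode='input'):

    # load pre-cropped patches for grid position (2, 4) from disk
    datagen = ImageDataGenerator(rescale=1/255)
    patch_batches = datagen.flow_from_directory(
        directory=os.path.join(patches_root, '24'),  # patches_root is an assumption
        target_size=(grid_size, grid_size),
        batch_size=batch_size,
        color_mode='rgb',
        class_mode='input',
    )

This way every image is read from disk and cropped exactly once, and each of the 100 generators only loads the small patches it actually needs.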