Question

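I am currently working on a CNN-related project and I am new to this field. I have a set of 500 images of fabric defects. How can I increase the number of images to 2000? Are there any libraries I can use for this?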

Answer 1

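There are different data augmentation techniques, such as scaling, mirroring, rotation, cropping, etc. The idea is to create new images from your initial set of images, so that the model has to take into account the new information introduced by these variations.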
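As a minimal sketch of what some of these transformations look like, the following uses only NumPy on a random array standing in for a real image. The function names (`mirror`, `rotate90`, `center_crop`) are made up for illustration and do not come from any library.

```python
import numpy as np

def mirror(image):
    """Flip the image left to right (horizontal mirroring)."""
    return image[:, ::-1]

def rotate90(image, k=1):
    """Rotate the image counter-clockwise by k * 90 degrees."""
    return np.rot90(image, k)

def center_crop(image, fraction=0.8):
    """Keep the central `fraction` of the height and width."""
    h, w = image.shape[:2]
    new_h, new_w = int(h * fraction), int(w * fraction)
    top, left = (h - new_h) // 2, (w - new_w) // 2
    return image[top:top + new_h, left:left + new_w]

# Random 100x120 RGB array standing in for a real fabric image (assumption)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(100, 120, 3), dtype=np.uint8)

# Three new variants derived from a single original
variants = [mirror(image), rotate90(image), center_crop(image)]
```

Each variant shows the same content from a slightly different view, which is what lets one original image contribute several training samples.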

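There are several ways to do this. The first is OpenCV; you can also use Keras on top of Tensorflow, which provides built-in high-level functions for data generation, or scikit-image.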

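I suggest starting with simple and effective techniques, such as mirroring and random cropping, and then moving on to color or contrast augmentation.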
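A rough sketch of that progression, assuming NumPy only and 8-bit images: random mirroring and random cropping first, then a simple contrast change. All function names and the parameter ranges here are illustrative choices, not recommendations from any library.

```python
import numpy as np

def random_mirror(image, rng, p=0.5):
    """Mirror the image horizontally with probability p."""
    if rng.random() < p:
        return image[:, ::-1]
    return image

def random_crop(image, rng, max_fraction=0.1):
    """Cut a random amount, up to max_fraction of the size, off each side."""
    h, w = image.shape[:2]
    max_h, max_w = int(h * max_fraction), int(w * max_fraction)
    top, bottom = rng.integers(0, max_h + 1, size=2)
    left, right = rng.integers(0, max_w + 1, size=2)
    return image[top:h - bottom, left:w - right]

def random_contrast(image, rng, low=0.8, high=1.2):
    """Scale pixel values around the image mean by a random factor."""
    factor = rng.uniform(low, high)
    mean = image.mean()
    adjusted = (image.astype(np.float32) - mean) * factor + mean
    return np.clip(adjusted, 0, 255).astype(np.uint8)

def augment(image, rng):
    """Apply the three random steps in sequence."""
    image = random_mirror(image, rng)
    image = random_crop(image, rng)
    return random_contrast(image, rng)

rng = np.random.default_rng(seed=0)
# Random 64x64 RGB array standing in for a real image (assumption)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = [augment(image, rng) for _ in range(4)]
```

Note that the cropped results differ in size, so they would likely need to be resized to a common shape before being fed to a CNN.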


Answer 2

My go-to library for image augmentation is imgaug.

The documentation is self-explanatory, but here is an example:


import numpy as np
from imgaug import augmenters as iaa
from PIL import Image

# load the image and convert it to a NumPy array
image = np.array(Image.open("<path to image>"))

# the augmenter accepts a list of images, so wrap the single image in a list
# (for this demonstration we only use one image)
image = [image]

# each augmenter acts randomly: some with a given probability, others with randomly sampled strength
augmenter = iaa.Sequential([
    iaa.Fliplr(0.5), # horizontal flips
    iaa.Crop(percent=(0, 0.1)), # random crops

    iaa.Sometimes(
        0.5,
        iaa.GaussianBlur(sigma=(0, 0.5))
    ),

    iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5),

], random_order=True) # apply augmenters in random order

augmented_image = augmenter(images=image)

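augmented_image is now a list containing one augmented version of the original image. Since you said you want to create 2000 images from 500, you can do the following: augment each image 4 times, i.e.: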


# image_paths is assumed to be a list with the paths of your 500 images
total_images = []
for image_path in image_paths:
    # PIL has no Image.load; open the file and convert it to a NumPy array
    image = np.array(Image.open(image_path))

    # create a list containing the same image four times
    images = [image for _ in range(4)]

    # pass it to the augmenter to get 4 different augmentations
    augmented_images = augmenter(images=images)

    # add all images to a list, or save them to disk instead
    total_images += augmented_images
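
How many augmentations per image you need depends on whether the 500 originals count toward the 2000. The small helper below is a hypothetical sketch of that arithmetic (the name `copies_per_image` is made up for illustration):

```python
import math

def copies_per_image(n_originals, n_target, keep_originals=True):
    """Number of augmented copies to generate per original image."""
    needed = n_target - n_originals if keep_originals else n_target
    return max(0, math.ceil(needed / n_originals))

# Originals kept in the final set: 500 + 3 * 500 = 2000
print(copies_per_image(500, 2000))                        # 3
# Only augmented images in the final set: 4 * 500 = 2000
print(copies_per_image(500, 2000, keep_originals=False))  # 4
```

With 4 augmentations per image as in the loop above, total_images holds 2000 augmented images, not counting the originals.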
