Question

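In my computer vision application, I have been performing image operations in the following order:

1. Resize using bilinear interpolation
2. Crop

This is very inefficient when each image contains multiple areas of interest (AoIs), because each AoI has to be resized at a different scale.

For example: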

from PIL import Image
import numpy as np
from scipy.misc import imresize  # removed in SciPy 1.3; needs an older SciPy

img_array = np.array(Image.open('foo.png').convert('L'))

# [top, left, bottom, right]
areas_of_interest = np.array([
    [10.0, 14.0, 55.0, 70.0],
    [81.0, 33.0, 170.0, 88.0],
])

# [vertical scale, horizontal scale]
scales = np.array([
    [1.5, 2.3],
    [1.2, 1.9],
])

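for aoi, scale in zip(areas_of_interest, scales):
    # imresize needs an integer output shape
    new_shape = np.round(np.array(img_array.shape) * scale).astype(int)
    resized_img = imresize(
        img_array,
        new_shape,
        interp='bilinear',
    )

    # slice indices must be integers as well
    scaled_aoi = np.round(aoi * np.tile(scale, reps=2)).astype(int)
    cropped_img = resized_img[
        scaled_aoi[0]:scaled_aoi[2],
        scaled_aoi[1]:scaled_aoi[3],
    ]

    do_something_with_img(cropped_img)  # placeholder for downstream processing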

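Is there a way to reverse the order of the crop and resize operations while keeping the resulting images byte-for-byte identical?

If I simply crop and then resize, the resulting image differs from the one produced by the old approach, for two reasons:

1. The borders lack the pixel data needed to perform the interpolation;
2. The offset of the AoI with respect to the original image is not taken into account.

As a result, I get images that look very similar to those from the original method, but they do not necessarily have the same dimensions, and the pixel values also differ because of pixel aliasing.

I believe that, in theory, it should be possible to get the same results as the original method, because of the way bilinear interpolation works.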
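
To make that intuition concrete, here is bilinear upscaling written out under one common convention (half-pixel centres; this is an assumption, since libraries differ in the exact coordinate mapping and in how they handle borders). With input image $I$, output image $O$, and vertical and horizontal scales $s_v$ and $s_h$, each output pixel depends on only four neighbouring source pixels:

```latex
\begin{aligned}
y &= \frac{i + \tfrac12}{s_v} - \tfrac12, \qquad
x = \frac{j + \tfrac12}{s_h} - \tfrac12, \\
r &= \lfloor y \rfloor, \quad c = \lfloor x \rfloor, \quad
\alpha = y - r, \quad \beta = x - c, \\
O(i, j) &= (1-\alpha)(1-\beta)\, I(r, c) + (1-\alpha)\,\beta\, I(r, c+1) \\
        &\quad + \alpha\,(1-\beta)\, I(r+1, c) + \alpha\,\beta\, I(r+1, c+1).
\end{aligned}
```

Because $(i, j)$ are indices in the full resized image, the mapping to $(y, x)$ already encodes the AoI offset. A crop that keeps rows $r$ and $r+1$ and columns $c$ and $c+1$ for every output pixel of the AoI contains everything the formula needs.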

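I think a solution needs three ingredients:

1. Include a margin on each side of the AoI when cropping;
2. Take the offset of the AoI w.r.t. the original image into account when performing the interpolation;
3. Correctly crop away the margin after resizing.

Numbers 1 and 3 should be easy, but I don't know of any library that does number 2 for me, and I really don't want to roll my own bilinear image scaling in numpy or cython.

Does anyone know how to get number 2 without jumping through too many hoops? I have access to PIL, the usual numpy / scipy stack, and OpenCV (cv2).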
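
For illustration only, the three ingredients can be sketched in plain numpy. This hand-rolls the interpolation, which is exactly what the question hopes to avoid, and it assumes a half-pixel-centre mapping with edge clamping that is not claimed to match `imresize`, PIL, or cv2. The helper names (`source_coords`, `bilinear_sample`, `resize_full`, `resize_window`) are hypothetical, and the result is left as float64 rather than rounded back to uint8. It only shows that a margin plus an offset-aware coordinate mapping reproduces the resize-then-crop pixels under a fixed convention.

```python
import numpy as np


def source_coords(out_start, out_stop, scale, src_size):
    """Source coordinate of each output pixel centre in [out_start, out_stop).

    Assumes the half-pixel-centre convention and clamps at the image border.
    """
    coords = (np.arange(out_start, out_stop) + 0.5) / scale - 0.5
    return np.clip(coords, 0, src_size - 1)


def bilinear_sample(img, rows, cols):
    """Bilinearly sample img on the grid spanned by 1-D rows and cols."""
    img = img.astype(np.float64)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, img.shape[0] - 1)
    c1 = np.minimum(c0 + 1, img.shape[1] - 1)
    fr = (rows - r0)[:, None]  # vertical weights, one per output row
    fc = (cols - c0)[None, :]  # horizontal weights, one per output column
    top = img[np.ix_(r0, c0)] * (1 - fc) + img[np.ix_(r0, c1)] * fc
    bottom = img[np.ix_(r1, c0)] * (1 - fc) + img[np.ix_(r1, c1)] * fc
    return top * (1 - fr) + bottom * fr


def resize_full(img, scale):
    """Reference path: resize the whole image."""
    out_h, out_w = np.round(np.array(img.shape) * scale).astype(int)
    rows = source_coords(0, out_h, scale[0], img.shape[0])
    cols = source_coords(0, out_w, scale[1], img.shape[1])
    return bilinear_sample(img, rows, cols)


def resize_window(img, scale, window):
    """Crop with a margin, then resize only the requested output window.

    window = (top, left, bottom, right) in *resized* image coordinates.
    """
    top, left, bottom, right = window
    rows = source_coords(top, bottom, scale[0], img.shape[0])
    cols = source_coords(left, right, scale[1], img.shape[1])

    # Ingredient 1: keep every source pixel the window touches, plus one neighbour.
    r_lo, c_lo = int(np.floor(rows[0])), int(np.floor(cols[0]))
    r_hi, c_hi = int(np.floor(rows[-1])) + 2, int(np.floor(cols[-1])) + 2
    patch = img[r_lo:r_hi, c_lo:c_hi]

    # Ingredient 2: shift the coordinates by the crop offset before interpolating.
    # Ingredient 3 is implicit: only the window's own output pixels are sampled.
    return bilinear_sample(patch, rows - r_lo, cols - c_lo)


rng = np.random.RandomState(0)
img = rng.randint(0, 256, size=(40, 50)).astype(np.uint8)
scale = (1.5, 2.3)
window = (15, 32, 40, 90)  # [top, left, bottom, right] in resized coordinates

reference = resize_full(img, scale)[window[0]:window[2], window[1]:window[3]]
patch_result = resize_window(img, scale, window)
print(np.array_equal(reference, patch_result))
```

Both paths perform the same floating-point operations on the same source pixels, so the comparison is expected to hold exactly under this convention. Matching a specific library byte for byte would additionally require reproducing that library's coordinate mapping, border handling, and rounding.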

Edit: adding a concrete example. Run it in a Jupyter notebook.

from PIL import Image
import numpy as np
from scipy.misc import imresize  # removed in SciPy 1.3; needs an older SciPy

def nbimage(data, mode):
    '''Display raw data as a notebook inline image.

    Parameters:
    data: array-like object, two or three dimensions. If three dimensional,
          first or last dimension must have length 3 or 4 and will be
          interpreted as color (RGB or RGBA).
    mode: the PIL image mode
    '''
    from IPython.display import display, Image as IpyImage
    from PIL.Image import fromarray
    from io import BytesIO

    s = BytesIO()  # PNG data is binary, so use a bytes buffer
    fromarray(data, mode=mode).save(s, 'png')
    display(IpyImage(s.getvalue()))

# This is http://www.pdphoto.org/PictureDetail.php?mat=pdef&pg=7917
img = Image.open('zoo_1_bg_012604.jpg').convert('L')

img_array = np.array(img)
nbimage(img_array, mode='L')

Define the areas of interest and the scales:

areas_of_interest = np.array([
    [155, 623, 155 + 114, 623 + 134],
    [567, 605, 136 + 567, 605 + 190],
], dtype=float)

scales = np.array([
    [1.5, 2.3],
    [1.2, 1.9],
])

Resize then crop:

resize_then_crop = []

for aoi, scale in zip(areas_of_interest, scales):
    new_shape = np.round(np.array(img_array.shape) * scale).astype(int)
    resized_img = imresize(img_array, new_shape, interp='bilinear')

    scaled_aoi = np.round(aoi * np.tile(scale, reps=2)).astype(int)
    cropped_img = resized_img[
        scaled_aoi[0]:scaled_aoi[2],
        scaled_aoi[1]:scaled_aoi[3],
    ]

    resize_then_crop.append(cropped_img)

Crop then resize:

crop_then_resize = []
for aoi, scale in zip(areas_of_interest, scales):
    aoi = aoi.astype(int)

    cropped_img = img_array[
        aoi[0]:aoi[2],
        aoi[1]:aoi[3],
    ]

    new_shape = np.round(np.array(cropped_img.shape) * scale).astype(int)
    resized_img = imresize(cropped_img, new_shape, interp='bilinear')

    crop_then_resize.append(resized_img)

Compare the results:

for img1, img2 in zip(resize_then_crop, crop_then_resize):
    nbimage(img1, mode='L')
    nbimage(img2, mode='L')
    print('Shape of resize-then-crop:', img1.shape)
    print('Shape of crop-then-resize:', img2.shape)
    print('Are they equal?', np.array_equal(img1, img2))

Shape of resize-then-crop: (172, 308)
Shape of crop-then-resize: (171, 308)
Are they equal? False

Shape of resize-then-crop: (164, 360)
Shape of crop-then-resize: (163, 361)
Are they equal? False
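
The shape mismatch in this output can be traced to rounding alone, before any interpolation takes place. The sketch below repeats the size arithmetic of the two loops with the AoIs and scales from the example. The helper names are hypothetical, and it uses Python's built-in `round`, which, like `np.round`, rounds halves to the nearest even integer.

```python
def resize_then_crop_extent(start, stop, scale):
    """Output length when both AoI edges are scaled and rounded separately."""
    return round(stop * scale) - round(start * scale)


def crop_then_resize_extent(start, stop, scale):
    """Output length when the cropped length is scaled and rounded once."""
    return round((stop - start) * scale)


# [top, left, bottom, right] and [vertical scale, horizontal scale], as in the example
areas_of_interest = [
    (155, 623, 155 + 114, 623 + 134),
    (567, 605, 136 + 567, 605 + 190),
]
scales = [(1.5, 2.3), (1.2, 1.9)]

shapes = []
for (top, left, bottom, right), (sv, sh) in zip(areas_of_interest, scales):
    rtc = (resize_then_crop_extent(top, bottom, sv),
           resize_then_crop_extent(left, right, sh))
    ctr = (crop_then_resize_extent(top, bottom, sv),
           crop_then_resize_extent(left, right, sh))
    shapes.append((rtc, ctr))
    print('resize-then-crop:', rtc, 'crop-then-resize:', ctr)
```

Whenever the two scaled edges round in opposite directions, the lengths differ by one pixel. A reordering that aims for identical output would therefore have to take its output size from the rounded edges of the resize-then-crop path, not from the cropped length.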


0 answers: