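In my computer vision application, I have been performing image operations in the following order: resize the whole image first, then crop out the area of interest.
This is very inefficient when each image contains multiple areas of interest (AoI), each of which needs to be resized at a different scale.
For example: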
from PIL import Image
import numpy as np
# Note: scipy.misc.imresize was removed in SciPy 1.3, so this needs an older SciPy.
from scipy.misc import imresize

img_array = np.array(Image.open('foo.png').convert('L'))

# Each row is [top, left, bottom, right]
areas_of_interest = np.array([
    [10.0, 14.0, 55.0, 70.0],
    [81.0, 33.0, 170.0, 88.0],
])

# Each row is [vertical scale, horizontal scale]
scales = np.array([
    [1.5, 2.3],
    [1.2, 1.9],
])

for aoi, scale in zip(areas_of_interest, scales):
    # Resize the whole image; the target shape must be made of integers
    new_shape = np.round(np.array(img_array.shape) * scale).astype(int)
    resized_img = imresize(
        img_array,
        new_shape,
        interp='bilinear',
    )
    # Scale the AoI into the resized image and round it so it can be used for slicing
    scaled_aoi = np.round(aoi * np.tile(scale, reps=2)).astype(int)
    cropped_img = resized_img[
        scaled_aoi[0]:scaled_aoi[2],
        scaled_aoi[1]:scaled_aoi[3],
    ]
    do_something_with_img(cropped_img)
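Is there a way to reverse the order of the crop and resize operations while keeping the resulting image byte-for-byte identical?
If I simply crop and then resize, the result differs from the old method: the images look very similar, but the dimensions are not necessarily the same, and the pixel values differ as well because of pixel aliasing.
I believe that, in theory, it should be possible to get exactly the same result as the original method, because of how bilinear interpolation works.
I think I need three ingredients for a solution:
Numbers 1 and 3 should be easy, but I don't know of any library that does number 2 for me, and I really don't want to roll my own bilinear image scaling in numpy or cython.
Does anyone know how to get number 2 without jumping through too many hoops? I have access to PIL, the usual numpy / scipy stack, and OpenCV (cv2).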
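To make the "in theory" part concrete, here is a minimal numpy-only sketch. It is only an illustration of the idea, not a drop-in replacement for `imresize`: the function name `bilinear_window` and the half-pixel-centre sampling convention are my own assumptions. The point is that each output pixel of a bilinear resize depends only on its position in the full output grid, so a window of the resized image can be computed directly from the source without resizing everything else.

```python
import numpy as np


def bilinear_window(img, full_out_shape, row_range, col_range):
    """Compute only a window of a bilinearly resized 2-D image.

    img            -- 2-D array (grayscale image)
    full_out_shape -- (height, width) the full resized image would have
    row_range      -- (start, stop) rows of the window, in output pixels
    col_range      -- (start, stop) columns of the window, in output pixels
    """
    img = np.asarray(img, dtype=np.float64)
    in_h, in_w = img.shape
    out_h, out_w = full_out_shape

    def source_coords(start, stop, in_size, out_size):
        # Map output pixel centres back to source coordinates
        # (half-pixel-centre convention, an assumption of this sketch).
        idx = np.arange(start, stop)
        pos = (idx + 0.5) * in_size / out_size - 0.5
        pos = np.clip(pos, 0, in_size - 1)
        lo = np.floor(pos).astype(int)
        hi = np.minimum(lo + 1, in_size - 1)
        return lo, hi, pos - lo

    r_lo, r_hi, r_w = source_coords(row_range[0], row_range[1], in_h, out_h)
    c_lo, c_hi, c_w = source_coords(col_range[0], col_range[1], in_w, out_w)

    # Interpolate along columns first, then along rows.
    top = img[r_lo][:, c_lo] * (1 - c_w) + img[r_lo][:, c_hi] * c_w
    bottom = img[r_hi][:, c_lo] * (1 - c_w) + img[r_hi][:, c_hi] * c_w
    out = top * (1 - r_w[:, None]) + bottom * r_w[:, None]
    return np.round(out).astype(np.uint8)


# Synthetic test image, so the example does not depend on any file.
rng = np.random.RandomState(0)
img = rng.randint(0, 256, size=(40, 60)).astype(np.uint8)
out_shape = (60, 138)  # 40 * 1.5 rows, 60 * 2.3 columns

full = bilinear_window(img, out_shape, (0, 60), (0, 138))      # resize everything
window = bilinear_window(img, out_shape, (15, 45), (30, 90))   # only the AoI

print(np.array_equal(full[15:45, 30:90], window))
```

Under this particular convention the window matches the same slice of the full resize by construction. Matching `imresize` byte for byte would additionally require reproducing its exact sampling grid and rounding, which this sketch does not attempt.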
Edit: adding a concrete example. Run it in a Jupyter notebook.
from PIL import Image
import numpy as np
from scipy.misc import imresize


def nbimage(data, mode):
    '''Display raw data as a notebook inline image.

    Parameters:
    data: array-like object, two or three dimensions. If three dimensional,
          first or last dimension must have length 3 or 4 and will be
          interpreted as color (RGB or RGBA).
    mode: the PIL image mode
    '''
    from IPython.display import display, Image as IpyImage
    from PIL.Image import fromarray
    from io import BytesIO  # PNG data is binary, so use a bytes buffer

    s = BytesIO()
    fromarray(data, mode=mode).save(s, 'png')
    display(IpyImage(s.getvalue()))


# This is http://www.pdphoto.org/PictureDetail.php?mat=pdef&pg=7917
img = Image.open('zoo_1_bg_012604.jpg').convert('L')
img_array = np.array(img)
nbimage(img_array, mode='L')
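Define the areas of interest and the scales:
# Each row is [top, left, bottom, right]
areas_of_interest = np.array([
    [155, 623, 155 + 114, 623 + 134],
    [567, 605, 136 + 567, 605 + 190],
], dtype=float)

# Each row is [vertical scale, horizontal scale]
scales = np.array([
    [1.5, 2.3],
    [1.2, 1.9],
])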
Resize, then crop:
resize_then_crop = []
for aoi, scale in zip(areas_of_interest, scales):
    new_shape = np.round(np.array(img_array.shape) * scale).astype(int)
    resized_img = imresize(img_array, new_shape, interp='bilinear')
    # Scale the AoI into the coordinates of the resized image
    scaled_aoi = np.round(aoi * np.tile(scale, reps=2)).astype(int)
    cropped_img = resized_img[
        scaled_aoi[0]:scaled_aoi[2],
        scaled_aoi[1]:scaled_aoi[3],
    ]
    resize_then_crop.append(cropped_img)
Crop, then resize:
crop_then_resize = []
for aoi, scale in zip(areas_of_interest, scales):
    aoi = aoi.astype(int)
    cropped_img = img_array[
        aoi[0]:aoi[2],
        aoi[1]:aoi[3],
    ]
    # The target shape is computed from the crop, not from the full image
    new_shape = np.round(np.array(cropped_img.shape) * scale).astype(int)
    resized_img = imresize(cropped_img, new_shape, interp='bilinear')
    crop_then_resize.append(resized_img)
Compare the results:
for img1, img2 in zip(resize_then_crop, crop_then_resize):
    nbimage(img1, mode='L')
    nbimage(img2, mode='L')
    print('Shape of resize-then-crop: {}'.format(img1.shape))
    print('Shape of crop-then-resize: {}'.format(img2.shape))
    print('Are they equal? {}'.format(np.array_equal(img1, img2)))
Shape of resize-then-crop: (172, 308)
Shape of crop-then-resize: (171, 308)
Are they equal? False
Shape of resize-then-crop: (164, 360)
Shape of crop-then-resize: (163, 361)
Are they equal? False
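The one-pixel differences in shape can be reproduced with arithmetic alone, without touching any image. The sketch below assumes the same rounding as in my loops above (`np.round`, which rounds halves to the nearest even integer); the helper names `resize_then_crop_shape` and `crop_then_resize_shape` are made up for this illustration. Rounding the scaled corners separately is not the same as rounding the scaled width and height.

```python
import numpy as np


def resize_then_crop_shape(aoi, scale):
    # Scale both corners, round each of them, then take the difference.
    scaled = np.round(np.asarray(aoi, dtype=float) * np.tile(scale, 2)).astype(int)
    return (int(scaled[2] - scaled[0]), int(scaled[3] - scaled[1]))


def crop_then_resize_shape(aoi, scale):
    # Take the difference first, then scale and round the size once.
    aoi = np.asarray(aoi, dtype=float)
    size = np.array([aoi[2] - aoi[0], aoi[3] - aoi[1]])
    rounded = np.round(size * np.asarray(scale)).astype(int)
    return (int(rounded[0]), int(rounded[1]))


# Same AoIs ([top, left, bottom, right]) and scales as in the example above.
areas_of_interest = [
    [155, 623, 155 + 114, 623 + 134],
    [567, 605, 136 + 567, 605 + 190],
]
scales = [
    [1.5, 2.3],
    [1.2, 1.9],
]

for aoi, scale in zip(areas_of_interest, scales):
    print(resize_then_crop_shape(aoi, scale), crop_then_resize_shape(aoi, scale))
```

For the first AoI, the top edge 155 * 1.5 = 232.5 rounds to 232 and the bottom edge 269 * 1.5 = 403.5 rounds to 404, giving 172 rows, while the cropped height 114 * 1.5 = 171 needs no rounding at all.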