我目前正在按照内部要求在Pyhton(3.6)中完成一个程序。作为其中的一部分,我必须遍历彩色图像(每个像素3个字节,R,G和B)并逐个像素扭曲图像。
我在其他语言(C ++,C#)中使用相同的代码,未优化的代码在大约两秒钟内执行,而优化的代码在不到一秒钟的时间内执行。非优化代码是指矩阵乘法由我实现的10行函数执行。优化版本仅使用外部库进行乘法。
在Python中,这段代码耗时近300秒。我想不出一种矢量化此逻辑或加快其速度的方法,因为在嵌套循环中有几个“ if”。任何帮助将不胜感激。
import numpy as np
#for test purposes:
#roi = rect.rect(0, 0, 1200, 1200)
#input = DCImage.DCImage(1200, 1200, 3)
#correctionImage = DCImage.DCImage(1200,1200,3)
#siteToImage= np.zeros((3,3), np.float32)
#worldToSite= np.zeros ((4, 4))
#r11 = r12 = r13 = r21 = r22 = r23 = r31 = r32 = r33 = 0.0
#xMean = yMean = zMean = 0
#tx = ty = tz = 0
#epsilon = np.finfo(float).eps
#fx = fy = cx = cy = k1 = k2 = p1 = p2 = 0
for i in range (roi.x, roi.x + roi.width):
for j in range (roi.y , roi.y + roi.height):
if ( (input.pixels [i] [j] == [255, 0, 0]).all()):
#Coordinates conversion
siteMat = np.matmul(siteToImage, [i, j, 1])
world =np.matmul(worldToSite, [siteMat[0], siteMat[1], 0.0, 1.0])
xLocal = world[0] - xMean
yLocal = world[1] - yMean
zLocal = z_ortho - zMean
#From World to camera
xCam = r11*xLocal + r12*yLocal + r13*zLocal + tx
yCam = r21*xLocal + r22*yLocal + r23*zLocal + ty
zCam = r31*xLocal + r32*yLocal + r33*zLocal + tz
if (zCam > epsilon or zCam < -epsilon):
xCam = xCam / zCam
yCam = yCam / zCam
#// DISTORTIONS
r2 = xCam*xCam + yCam*yCam
a1 = 2*xCam*yCam
a2 = r2 + 2*xCam*xCam
a3 = r2 + 2*yCam*yCam
cdist = 1 + k1*r2 + k2*r2*r2
u = int((xCam * cdist + p1 * a1 + p2 * a2) * fx + cx + 0.5)
v = int((yCam * cdist + p1 * a3 + p2 * a1) * fy + cy + 0.5)
if (u>=0 and u<correctionImage.width and v>=0 and v < correctionImage.height):
input.pixels [i] [j] = correctionImage.pixels [u][v]
答案 0 :(得分:1)
通常,您通过制作置换图来矢量化这种事情。
制作一个复杂的图像,其中每个像素都有其自己的坐标值,应用常规的数学运算来计算所需的任何变换,然后将贴图应用于源图像。
例如,您可以在pyvips中输入:
import sys
import pyvips
image = pyvips.Image.new_from_file(sys.argv[1])
# this makes an image where pixel (0, 0) (at the top-left) has value [0, 0],
# and pixel (image.width, image.height) at the bottom-right has value
# [image.width, image.height]
index = pyvips.Image.xyz(image.width, image.height)
# make a version with (0, 0) at the centre, negative values up and left,
# positive down and right
centre = index - [image.width / 2, image.height / 2]
# to polar space, so each pixel is now distance and angle in degrees
polar = centre.polar()
# scale sin(distance) by 1/distance to make a wavey pattern
d = 10000 * (polar[0] * 3).sin() / (1 + polar[0])
# and back to rectangular coordinates again to make a set of vectors we can
# apply to the original index image
distort = index + d.bandjoin(polar[1]).rect()
# distort the image
distorted = image.mapim(distort)
# pick pixels from either the distorted image or the original, depending on some
# condition
result = (d.abs() > 10 or image[2] > 100).ifthenelse(distorted, image)
result.write_to_file(sys.argv[2])
这只是一个愚蠢的摆动模式,但是您可以将其交换为所需的任何失真。然后运行为:
$ /usr/bin/time -f %M:%e ./wobble.py ~/pics/horse1920x1080.jpg x.jpg
54572:0.31
在这台2015年两核笔记本电脑上实现300ms和55MB的内存,可以做到:
答案 1 :(得分:0)
经过大量测试后,无需使用C ++编写即可加速功能的唯一方法是分解并向量化它。在此特定实例中执行此操作的方法是,在函数的开头创建一个带有有效索引的数组,并将它们用作元组来为最终解决方案建立索引。
subArray[roi.y:roi.y+roi.height,roi.x:roi.x+roi.width,] = input.pixels[roi.y:roi.y+roi.height,roi.x:roi.x+roi.width,]
#Calculate valid XY indexes
y_index, x_index = np.where(np.all(subArray== np.array([255,0,0]), axis=-1))
#....
#do stuff
#....
#Join result values with XY indexes
ij_xy = np.column_stack((i, j, y_index, x_index))
#Only keep valid ij values
valids_ij_xy = ij_xy [(ij_xy [:,0] >= 0) & (ij_xy [:,0] < correctionImage.height) & (ij_xy [:,1] >= 0) & (ij_xy [:,1] < correctionImage.width)]
#Assign values
input.pixels [tuple(np.array(valids_ij_xy [:,2:]).T)] = correctionImage.pixels[tuple(np.array(valids_ij_xy [:,:2]).T)]