NumPy 2D array iteration speed

Time: 2018-05-09 13:14:45

Tags: python arrays python-imaging-library

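I have a loop that fills a 2-D NumPy array with pixel information for PIL; this array is called "Shadows". The colours are either white or blue. I want to build up a final image from these in which white dominates. That is, if one image in the loop has a blue pixel at coordinate x,y and another image in the loop has a white pixel at the same coordinate, the final pixel will be white.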

Currently this is done as follows:

import math, random, copy
import numpy as np
from PIL import Image, ImageDraw

colours = {0: (255,255,255), 1: (0,0,255)}

#width and height of area of interest
w = 100 #100 meter
h = 200 #200 meter

NumberOfDots = 10
DotRadius = 20
NumberOfRuns = 3

Final = np.array([[colours[0] for x in range(w)] for y in range(h)])
Shadows = np.array([[colours[0] for x in range(w)] for y in range(h)])

for SensorNum in range(NumberOfRuns):

  Shadows = np.array([[colours[0] for x in range(w)] for y in range(h)])

  for dot in range(NumberOfDots):

    ypos = random.randint(DotRadius, h-DotRadius)
    xpos = random.randint(DotRadius, w-DotRadius)

    for i in range(xpos - DotRadius, xpos + DotRadius):
      for j in range(ypos - DotRadius, ypos + DotRadius):
          if math.sqrt((xpos - i)**2 + (ypos - j)**2) < DotRadius:
            Shadows[j][i] = colours[1]

  im = Image.fromarray(Shadows.astype('uint8')).convert('RGBA')
  im.save('result_test_image'+str(SensorNum)+'.png')

  #This for loop below is the bottle-neck. Can its speed be improved?
  if SensorNum > 0:
    for i in range(w):
      for j in range(h):
        #White space dominates.
        #(pixel by pixel) If the current image's pixel is white and the unfinished Final
        #image's pixel is blue, then set the final pixel to white.
        if np.all(Shadows[j][i]==colours[0]) and np.all(Final[j][i]==colours[1]):
          Final[j][i] = colours[0]
  else:
    Final = copy.deepcopy(Shadows)

im = Image.fromarray(Final.astype('uint8')).convert('RGBA')
im.save('result_final_test.png')

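The final nested for loop is the part I am interested in improving. It works fine, but the iteration is a huge bottleneck. Is there any way to make it faster, for example by vectorizing it?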

1 answer:

Answer 0 (score: 1)

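Of course it is possible to vectorize the last for loop in your code, since no iteration depends on values calculated in an earlier iteration. But honestly, it was not as easy as I thought it would be...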

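My approach is around 800 to 1000 times faster than your current loop. I replaced the upper-case array and variable names with lower-case names using underscores. Upper case is usually reserved for classes in Python. That is the reason for the strange code colouring in your question.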

if sensor_num > 0:
    mask = (  # create a mask where the condition is True
        ((shadows[:, :, 0] == 255) &  # R=255
         (shadows[:, :, 1] == 255) &  # G=255
         (shadows[:, :, 2] == 255)) &  # B=255
        ((final[:, :, 0] == 0) &  # R=0
         (final[:, :, 1] == 0) &  # G=0
         (final[:, :, 2] == 255)))  # B=255
    final[mask] = np.array([255, 255, 255])  # set Final to white where mask is True
else:
    final = copy.deepcopy(shadows)

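The RGB values can of course be replaced with lookups of predefined values, such as your colours dict. But I would suggest using an array to store the colours, especially if you plan to index it with numbers: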

colours = np.array([[255, 255, 255], [0, 0, 255]])

This way the mask looks like:

mask = (  # create a mask where the condition is True
    ((shadows[:, :, 0] == colours[0, 0]) &  # R=255
     (shadows[:, :, 1] == colours[0, 1]) &  # G=255
     (shadows[:, :, 2] == colours[0, 2])) &  # B=255
    ((final[:, :, 0] == colours[1, 0]) &  # R=0
     (final[:, :, 1] == colours[1, 1]) &  # G=0
     (final[:, :, 2] == colours[1, 2])))  # B=255
final[mask] = colours[0]  # set Final to white where mask is True

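Of course, this also works with a dict, as long as the entries are indexed as colours[0][0] rather than colours[0, 0].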
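
The three per-channel comparisons can also be collapsed into one reduction over the colour axis. The following is only a minimal sketch and not part of the approach timed above; it assumes `shadows` and `final` are arrays of shape `(h, w, 3)` and that `colours` is the array defined earlier. The helper name `apply_white_dominance` is made up for illustration.

```python
import numpy as np

colours = np.array([[255, 255, 255], [0, 0, 255]])  # 0: white, 1: blue

def apply_white_dominance(final, shadows, colours=colours):
    """Hypothetical helper: turn blue pixels of `final` white wherever
    `shadows` is white. Modifies `final` in place and returns it."""
    # np.all(..., axis=-1) is True where all three channels match
    shadows_is_white = np.all(shadows == colours[0], axis=-1)
    final_is_blue = np.all(final == colours[1], axis=-1)
    mask = shadows_is_white & final_is_blue
    final[mask] = colours[0]
    return final

# Tiny 1x3 image: (blue, blue, white) combined with (white, blue, blue)
final = np.array([[[0, 0, 255], [0, 0, 255], [255, 255, 255]]])
shadows = np.array([[[255, 255, 255], [0, 0, 255], [0, 0, 255]]])
apply_white_dominance(final, shadows)
# Only the first pixel changes: it was blue in final and white in shadows.
```

This version is shorter, but it builds intermediate `(h, w, 3)` boolean arrays, so it is not necessarily faster than the explicit per-channel mask.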

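To speed this up further, you can replace some of the RGB comparisons in the mask with comparisons against the array itself (stencil computation). For your array size this is about 5% faster, and the difference grows with increasing array size, but you lose the flexibility of comparing against other colours just by changing the entries in the colours array/dict. The mask with stencil operations looks like this: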

mask = (  # create a mask where the condition is True
    ((shadows[:, :, 0] == shadows[:, :, 1]) &  # R=G
     (shadows[:, :, 1] == shadows[:, :, 2]) &  # G=B
     (shadows[:, :, 2] == colours[0, 2])) &  # R=G=B=255
    ((final[:, :, 0] == final[:, :, 1]) &  # R=G
     (final[:, :, 1] == colours[1, 1]) &  # G=0
     (final[:, :, 2] == colours[1, 2])))  # B=255

This should help speed up your computation substantially.
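
One way to gain confidence in the vectorized update is to compare it with the nested loop from the question on small random inputs. This is a sketch under assumptions: the image size and the random seed are arbitrary, and the test images are generated by indexing the `colours` array with random 0/1 labels.

```python
import numpy as np

colours = np.array([[255, 255, 255], [0, 0, 255]])  # 0: white, 1: blue
rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only

h, w = 20, 10
# Indexing colours with an (h, w) array of labels gives an (h, w, 3) image
shadows = colours[rng.integers(0, 2, size=(h, w))]
final_start = colours[rng.integers(0, 2, size=(h, w))]

# Reference result: the pixel-by-pixel loop from the question
final_loop = final_start.copy()
for i in range(w):
    for j in range(h):
        if np.all(shadows[j, i] == colours[0]) and np.all(final_loop[j, i] == colours[1]):
            final_loop[j, i] = colours[0]

# Vectorized result: the per-channel mask
final_vec = final_start.copy()
mask = (
    ((shadows[:, :, 0] == colours[0, 0]) &
     (shadows[:, :, 1] == colours[0, 1]) &
     (shadows[:, :, 2] == colours[0, 2])) &
    ((final_vec[:, :, 0] == colours[1, 0]) &
     (final_vec[:, :, 1] == colours[1, 1]) &
     (final_vec[:, :, 2] == colours[1, 2])))
final_vec[mask] = colours[0]

assert np.array_equal(final_loop, final_vec)
```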

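Other parts of the code can be optimized as well. But of course that is only worthwhile once the final loop is no longer the bottleneck. Just one example: instead of calling random.randint in every loop iteration, you could call it once to create an array of random positions (plus the arrays with +/- DotRadius) and then loop over those arrays: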

# np.random.randint excludes the upper bound (random.randint includes it), hence the +1
ypos = np.random.randint(DotRadius, h - DotRadius + 1, size=NumberOfDots)
ypos_plus_dot_radius = ypos + DotRadius
ypos_minus_dot_radius = ypos - DotRadius
xpos = np.random.randint(DotRadius, w - DotRadius + 1, size=NumberOfDots)
xpos_plus_dot_radius = xpos + DotRadius
xpos_minus_dot_radius = xpos - DotRadius
for dot in range(NumberOfDots):
    yrange = np.arange(ypos_minus_dot_radius[dot], ypos_plus_dot_radius[dot])  # make range instead of looping
    # looping over xrange imho can't be avoided without further matrix operations
    for i in range(xpos_minus_dot_radius[dot], xpos_plus_dot_radius[dot]):
        # make a mask for the y-positions where the condition is true and
        # index the y-axis of Shadows with this mask:
        Shadows[yrange[np.sqrt((xpos[dot] - i)**2 + (ypos[dot] - yrange)**2) < DotRadius], i] = colours[1]
        # colours[1] can of course be replaced with any 3-element array or single integer/float
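
The remaining loop over x can probably be removed as well by broadcasting a column of y-coordinates against a row of x-coordinates. The following is a sketch of that idea rather than a measured improvement; the lower-case names and the use of `np.ogrid` are choices made here for illustration. It tests every pixel of the image for each dot, which does more arithmetic than the windowed loop above but keeps the work inside NumPy.

```python
import numpy as np

w, h = 100, 200
dot_radius = 20
number_of_dots = 10
colours = np.array([[255, 255, 255], [0, 0, 255]], dtype=np.uint8)

rng = np.random.default_rng()
# The upper bound is exclusive, so +1 keeps the inclusive range of random.randint
ypos = rng.integers(dot_radius, h - dot_radius + 1, size=number_of_dots)
xpos = rng.integers(dot_radius, w - dot_radius + 1, size=number_of_dots)

# Start from an all-white image
shadows = np.empty((h, w, 3), dtype=np.uint8)
shadows[:] = colours[0]

# yy has shape (h, 1) and xx has shape (1, w); together they broadcast to (h, w)
yy, xx = np.ogrid[:h, :w]

for y0, x0 in zip(ypos, xpos):
    # Comparing squared distances avoids the square root
    inside = (xx - x0) ** 2 + (yy - y0) ** 2 < dot_radius ** 2
    shadows[inside] = colours[1]
```

Because a pixel on the edge of the window is at distance DotRadius or more from the centre, testing the whole image selects the same pixels as the windowed loop.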