Question

I have a NumPy 3D array that is just a list of grayscale images:

import cv2
import numpy as np

images = np.zeros((xlen, height, width), dtype=int)
for i in range(xlen):
   images[i] = cv2.imread(filename[i], cv2.IMREAD_GRAYSCALE)

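All the images are nearly identical, but each one has some random noise pixels. My idea is that a noise pixel is the maximum or the minimum value compared with the same pixel in the other images.

So I need to:

Find the minimum and maximum value of each pixel
Compute the mean of each pixel across all images, excluding the maximum and minimum
Replace all minimum and maximum values with the computed mean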
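
For a single pixel position, the three steps above can be sketched as follows. The five values are made up for illustration and are not from the original post:

```python
import numpy as np

# Hypothetical values of one pixel position across five images.
pixel = np.array([100, 102, 255, 101, 0], dtype=float)

# Step 1: minimum and maximum at this position.
lo, hi = pixel.min(), pixel.max()

# Step 2: mean of what remains after dropping one minimum and one maximum.
trimmed_mean = (pixel.sum() - lo - hi) / (pixel.size - 2)

# Step 3: replace the extremes with that mean.
cleaned = np.where((pixel == lo) | (pixel == hi), trimmed_mean, pixel)
print(trimmed_mean, cleaned)
```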

I implemented it naively with standard Python functions, but it is too slow:

   #remove highest and lowest values for each pixel
   for el in range (height):
      for em in range (width):
         mylist = []
         for j in range (0, xlen):
            mylist.append(images[j][el][em])
         indmin = mylist.index(min(mylist))
         indmax = mylist.index(max(mylist))
         temp_counterx=0
         temp_sum = 0
         for j in range (0, xlen):
            if (j!=indmin) and (j!=indmax):
               temp_counterx +=1
               temp_sum += mylist[j]
         temp_val = int(temp_sum/temp_counterx)
         images[indmin][el][em]=temp_val
         images[indmax][el][em]=temp_val

Is it possible to speed this up with NumPy?

Update: I accepted the solution proposed in Answer 1, with a few small changes:

   mins = np.min(images, axis=0)
   maxs = np.max(images, axis=0)
   sums = np.sum(images, axis=0)
   # compute the mean without the extremes
   mean_without_extremes = (sums - mins - maxs) / (xlen - 2)
   mean_without_extremes = mean_without_extremes.astype(int)

   # replace minima and maxima with the mean
   # np.where(cond, a, b) takes a where cond is True, so the mean goes first
   images = np.where((mins==images), mean_without_extremes, images)
   images = np.where((maxs==images), mean_without_extremes, images)

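...and it is 30 times faster! It seems numpy provides a very fast and powerful computation engine, but it can be tricky to use at times because of the complex data structures it handles.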

Answer 1

First, to compute things like a mean you probably want to use floats rather than integers to begin with. So in the following I assume that you use them.
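
A minimal sketch of why the dtype matters, using a made-up mean of 3.5: writing a fractional value into an integer array silently drops the fraction, while a float array keeps it.

```python
import numpy as np

# Hypothetical mean of the non-extreme values at one pixel: (2 + 5) / 2
mean_without_extremes = 3.5

int_stack = np.zeros((4, 1, 1), dtype=int)
float_stack = np.zeros((4, 1, 1), dtype=float)

int_stack[0, 0, 0] = mean_without_extremes    # fractional part is dropped
float_stack[0, 0, 0] = mean_without_extremes  # value is stored unchanged

print(int_stack[0, 0, 0], float_stack[0, 0, 0])
```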

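By using Python loops you give up all the advantages of numpy, because they are inherently slow, at least compared with the underlying compiled code that runs when you call numpy functions. If you want your code to be reasonably fast, you should use vectorization. Consider the following code, which does what you ask but without any loops in Python: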

# compute minima, maxima and sum
mins = np.min(images, axis=0)
maxs = np.max(images, axis=0)
sums = np.sum(images, axis=0)
# compute the mean without the extremes
mean_without_extremes = (sums - mins - maxs) / (xlen - 2)
# replace maxima with the mean
images[images == mins] = mean_without_extremes.reshape(-1)
images[images == maxs] = mean_without_extremes.reshape(-1)

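Since you may not be familiar with this, I recommend reading the introductions to indexing and broadcasting in the documentation so that you can use numpy effectively.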
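
As a small illustration of both ideas, here is a sketch with a made-up stack of three 1x2 images. Comparing the stack against the per-pixel minima relies on broadcasting, and the resulting boolean array can be used as an index:

```python
import numpy as np

# Hypothetical stack: 3 images, each 1 pixel high and 2 pixels wide.
images = np.array([[[1.0, 5.0]],
                   [[3.0, 2.0]],
                   [[2.0, 9.0]]])   # shape (3, 1, 2)

mins = images.min(axis=0)           # shape (1, 2): one minimum per pixel

# Broadcasting: the (1, 2) array is compared against every image in the stack.
mask = images == mins               # shape (3, 1, 2), dtype bool

# Boolean indexing: returns the selected elements as a flat 1D array.
print(mask.shape, images[mask])
```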

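Edit: As pointed out in the comments, the solution above only works for xlen > 2 and only if each extreme value occurs exactly once per pixel position. This can be fixed by replacing the two assignment lines with: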

# using "np.where" as suggested by OP; a single call covers both extremes,
# which might be slightly faster than two separate calls
is_extreme = np.logical_or(images == mins, images == maxs)
# np.where(cond, a, b) takes a where cond is True, so the mean goes first
images = np.where(is_extreme, mean_without_extremes, images)
images[np.isnan(images)] = 0  # set "empty mean" to zero
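
Putting the pieces together, a self-contained sketch might look like the following. The function name `replace_extremes` and the sample values are assumptions made for illustration; the input is converted to float, as assumed at the start of this answer.

```python
import numpy as np


def replace_extremes(images):
    """Replace per-pixel minima and maxima with the mean of the other values."""
    images = np.asarray(images, dtype=float)
    xlen = images.shape[0]

    mins = images.min(axis=0)
    maxs = images.max(axis=0)
    sums = images.sum(axis=0)

    # With xlen == 2 this divides by zero; the resulting NaNs are zeroed below.
    with np.errstate(divide="ignore", invalid="ignore"):
        mean_without_extremes = (sums - mins - maxs) / (xlen - 2)

    is_extreme = np.logical_or(images == mins, images == maxs)
    cleaned = np.where(is_extreme, mean_without_extremes, images)
    cleaned[np.isnan(cleaned)] = 0  # set "empty mean" to zero
    return cleaned


# Hypothetical stack of four 1x2 images; the second pixel has a tied minimum.
stack = np.array([[[10, 5]], [[12, 5]], [[11, 7]], [[200, 9]]])
cleaned = replace_extremes(stack)
print(cleaned[:, 0, 0], cleaned[:, 0, 1])
```

Note that with ties, every tied position is replaced, while the mean still subtracts only one minimum and one maximum from the sum.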

Answer 2

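Make sure that everything you use is a numpy array and NOT a Python list, and that all elements have the same data type. In your case, this is true.

Now you can use a library called numba. It uses JIT (just-in-time) compilation.

A video demonstrating it is available here.

The numba documentation is available here.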
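
As a rough sketch of what that could look like, the loop from the question can be decorated with numba's `njit`. The function name `replace_extremes_jit`, the float input, and the fallback decorator (used only when numba is not installed) are assumptions for illustration, not part of the original answer:

```python
import numpy as np

try:
    from numba import njit  # compiles the decorated function on first call
except ImportError:
    def njit(func):  # fallback so the sketch still runs without numba
        return func


@njit
def replace_extremes_jit(images):
    """Replace each pixel's first minimum and maximum with the mean of the rest."""
    xlen, height, width = images.shape
    out = images.copy()
    for el in range(height):
        for em in range(width):
            indmin = 0
            indmax = 0
            total = 0.0
            for j in range(xlen):
                v = images[j, el, em]
                total += v
                if v < images[indmin, el, em]:
                    indmin = j
                if v > images[indmax, el, em]:
                    indmax = j
            # Skip pixels where all values are equal or too few images exist.
            if indmin != indmax and xlen > 2:
                lo = images[indmin, el, em]
                hi = images[indmax, el, em]
                mean = (total - lo - hi) / (xlen - 2)
                out[indmin, el, em] = mean
                out[indmax, el, em] = mean
    return out


# Hypothetical stack of four 1x1 float images.
stack = np.array([[[10.0]], [[12.0]], [[11.0]], [[200.0]]])
result = replace_extremes_jit(stack)
print(result[:, 0, 0])
```

Unlike the vectorized version, this loop replaces only the first occurrence of a tied minimum or maximum, which matches the behavior of the loop in the question.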

Numpy 3D array max and min value
