Finding the offsets of the lowest neighboring index in a Python array

Time: 2013-03-21 11:33:37

Tags: python numpy

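I have a problem where we are supposed to write a function that, given a 2D array as input, returns the row and column offsets of the lowest-valued neighbor of each index: one array for the row offset of every index and one array for the column offset. For example, if the lowest neighboring cell of an index is one row down and one column to the right, the offsets are 1, 1; if the lowest neighboring cell is directly to the left, the offsets are 0, -1; if the cell itself is the lowest among its neighbors, the offsets are 0, 0.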
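To make the offset convention concrete, here is a minimal sketch for a single interior cell. The 3x3 array `window` and its values are made up for illustration; the cell of interest is the center.

```python
import numpy as np

# Hypothetical 3x3 neighborhood; the cell of interest is the center (value 5).
window = np.array([[7, 8, 9],
                   [6, 5, 4],
                   [3, 2, 1]])

# Position of the minimum inside the window, as (row, col) in 0..2.
r, c = np.unravel_index(np.argmin(window), window.shape)

# Shift so that the center maps to (0, 0); offsets then lie in {-1, 0, 1}.
row_offset, col_offset = int(r) - 1, int(c) - 1
print(row_offset, col_offset)  # 1 1: the lowest neighbor is one row down, one column right
```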

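Because I could not find a faster way that was also correct, I wrote a nested loop that iterates over every index and uses a.all() to check which of the cells surrounding the point [i,j] is less than or equal to all the other surrounding cells: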

import numpy as np

def findLowNhbr( terrain ):
    """Creates two 2D-arrays the shape of terrain consisting
    of the offsets (row and column) to the neighbor with the minimum elevation"""
    rowOffset = np.zeros_like(terrain)
    colOffset = np.zeros_like(terrain)

    for i in range(len(terrain)):
        # Cells in the first and last row keep offsets of 0
        if i == 0:
            rowOffset[i] = 0
            colOffset[i] = 0
        elif i == (len(terrain)-1):
            rowOffset[i] = 0
            colOffset[i] = 0
        else:
            for j in range(len(terrain[i])):
                # Cells in the first and last column keep offsets of 0
                if j == 0 or j == len(terrain[i])-1:
                    rowOffset[:,j] = 0
                    colOffset[:,j] = 0
                elif (terrain[i-1:i+2,j-1:j+2]>=terrain[i-1,j-1]).all():
                    rowOffset[i,j] = -1
                    colOffset[i,j] = -1
                elif (terrain[i-1:i+2,j-1:j+2]>=terrain[i,j-1]).all():
                    rowOffset[i,j] = 0
                    colOffset[i,j] = -1
                elif (terrain[i-1:i+2,j-1:j+2]>=terrain[i+1,j-1]).all():
                    rowOffset[i,j] = 1
                    colOffset[i,j] = -1
                elif (terrain[i-1:i+2,j-1:j+2]>=terrain[i-1,j]).all():
                    rowOffset[i,j] = -1
                    colOffset[i,j] = 0
                elif (terrain[i-1:i+2,j-1:j+2]>=terrain[i+1,j]).all():
                    rowOffset[i,j] = 1
                    colOffset[i,j] = 0
                elif (terrain[i-1:i+2,j-1:j+2]>=terrain[i-1,j+1]).all():
                    rowOffset[i,j] = -1
                    colOffset[i,j] = 1
                # Right-hand neighbor: compare against terrain[i,j+1], not terrain[i,j]
                elif (terrain[i-1:i+2,j-1:j+2]>=terrain[i,j+1]).all():
                    rowOffset[i,j] = 0
                    colOffset[i,j] = 1
                elif (terrain[i-1:i+2,j-1:j+2]>=terrain[i+1,j+1]).all():
                    rowOffset[i,j] = 1
                    colOffset[i,j] = 1
                else:
                    # The cell itself is the lowest in its neighborhood
                    rowOffset[i,j] = 0
                    colOffset[i,j] = 0
    return rowOffset, colOffset

It takes a long time to run, but it does run. I can't imagine I'm actually doing this in the most efficient way; any input?

2 answers:

Answer 0: (score: 2)

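This should do it in a more or less vectorized way, ignoring some problems at the boundaries (np.roll wraps around, so border cells are compared against the opposite edge). You can avoid those by padding the input array with repeated edge values and trimming the output: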

import numpy as np

np.random.seed(0)
terrain = np.random.rand(10,10)

offsets = [(i,j) for i in range(-1,2) for j in range(-1,2)]

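# Shift by -i, -j so that layer k holds the neighbor at offsets[k];
# np.dstack needs a sequence, so pass a list rather than a generator
stacked = np.dstack([np.roll(np.roll(terrain,-i,axis=0),-j,axis=1) for i, j in offsets])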

offset_index = np.argmin(stacked,axis=2)
output = np.array(offsets)[offset_index]

Explanation:

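  • Stack all the shifted copies of the array into an NxMx9 array
  • Find the index of the smallest element (argmin) along the last axis, axis=2
  • Turn this index into an array of offset vectors by using the result to index into offsets, as in the last line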
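The question asks for two separate arrays, one for row offsets and one for column offsets. As a minimal sketch (the array size and the names `row_offset` and `col_offset` are my own choices for illustration), the last axis of `output` can be split like this. The arrays are shifted by -i and -j so that layer k of `stacked` holds the neighbor at `offsets[k]`.

```python
import numpy as np

# Illustrative data; any 2D float array would do.
rng = np.random.default_rng(0)
terrain = rng.random((6, 5))

offsets = [(i, j) for i in range(-1, 2) for j in range(-1, 2)]

# Layer k holds, for every cell, the value of the neighbor at offsets[k].
# np.roll wraps around, so cells on the border see the opposite edge.
stacked = np.dstack([np.roll(np.roll(terrain, -i, axis=0), -j, axis=1)
                     for i, j in offsets])

offset_index = np.argmin(stacked, axis=2)   # shape (6, 5), values 0..8
output = np.array(offsets)[offset_index]    # shape (6, 5, 2)

# Split the last axis into the two arrays the question asks for.
row_offset = output[..., 0]
col_offset = output[..., 1]
```

For interior cells, `terrain[r + row_offset[r, c], c + col_offset[r, c]]` should then equal the minimum of the 3x3 window centered on `(r, c)`.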

Another, possibly cleaner, way to get all the initial offsets is:

from itertools import product
offsets = list(product((-1, 0, 1), (-1, 0, 1)))
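One way the boundary handling might look is sketched below. Note that it swaps in a different padding than the edge-repeat padding suggested above: it pads with +inf, so that a padded cell can never be picked as the lowest neighbor (with repeated edge values, a tie could produce an offset that points outside the array). It also uses slices of the padded array instead of np.roll, so nothing has to be trimmed afterwards. The function name `lowest_neighbor_offsets` is hypothetical, and the sketch assumes the terrain can be converted to floats.

```python
import numpy as np
from itertools import product

def lowest_neighbor_offsets(terrain):
    """Hypothetical helper: offsets to the lowest cell in each 3x3 neighborhood."""
    terrain = np.asarray(terrain, dtype=float)  # assumption: float data, so inf is valid
    offsets = np.array(list(product((-1, 0, 1), repeat=2)))
    rows, cols = terrain.shape
    # One ring of +inf around the array: out-of-bounds neighbors never win.
    padded = np.pad(terrain, 1, mode="constant", constant_values=np.inf)
    # Slice k is the padded array seen from offsets[k]; no wrap-around is involved.
    stacked = np.stack([padded[1 + i:1 + i + rows, 1 + j:1 + j + cols]
                        for i, j in offsets], axis=-1)
    best = offsets[np.argmin(stacked, axis=-1)]
    return best[..., 0], best[..., 1]

example = np.array([[1.0, 2.0],
                    [3.0, 0.0]])
row_offset, col_offset = lowest_neighbor_offsets(example)
```

Unlike the loop in the question, which leaves border cells at 0, 0, this version gives border cells the offset of their lowest in-bounds neighbor.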

Answer 1: (score: 1)

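I like Mr. E's basic idea of stacking all the surrounding values along a single dimension, but I think there are better ways to create the stacked array and to convert the return value of np.argmin into index pairs: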

import numpy as np
from numpy.lib.stride_tricks import as_strided

rows, cols = 100, 100
win_rows, win_cols = 3, 3 # these two should be odd numbers
terrain = np.random.rand(rows, cols)

# This takes a windowed view of the original array, no data copied
win_terrain = as_strided(terrain, shape=(rows-win_rows+1, cols-win_cols+1,
                                         win_rows, win_cols),
                         strides=terrain.strides*2)
# This makes a copy of the stacked array that will take up x9 times more memory
# than the original one
win_terrain = win_terrain.reshape(win_terrain.shape[:2] + (-1,))

# argmin, since we want the lowest neighbor; the window shape is passed positionally
indices = np.argmin(win_terrain, axis=-1)
offset_rows, offset_cols = np.unravel_index(indices, (win_rows, win_cols))
# For some odd reason these arrays are non-writeable, so -= won't work
offset_rows = offset_rows - win_rows//2
offset_cols = offset_cols - win_cols//2

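The resulting arrays are only (98, 98), i.e. the first and last rows and columns are missing, because there is no fully defined window around them.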
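On newer NumPy versions (1.20 or later, as far as I know), `sliding_window_view` builds the same kind of windowed view without computing shapes and strides by hand. The following is a sketch under that assumption; the names `full_rows` and `full_cols` are mine, and filling the border with zeros simply mirrors the convention of the loop in the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

rng = np.random.default_rng(0)
terrain = rng.random((100, 100))

# Read-only view of shape (98, 98, 3, 3); no data is copied here.
windows = sliding_window_view(terrain, (3, 3))

# Flattening the last two axes copies the data (about 9x the original size).
flat = windows.reshape(windows.shape[:2] + (-1,))

indices = np.argmin(flat, axis=-1)
win_r, win_c = np.unravel_index(indices, (3, 3))
offset_rows = win_r - 1
offset_cols = win_c - 1

# Optional: embed the interior result in full-size arrays, leaving the border at 0.
full_rows = np.zeros(terrain.shape, dtype=int)
full_cols = np.zeros(terrain.shape, dtype=int)
full_rows[1:-1, 1:-1] = offset_rows
full_cols[1:-1, 1:-1] = offset_cols
```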