Question

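I have a large 2D ndarray of floats, call it ar. It contains some NaNs. I am interested in the neighbors to the right of each NaN (i.e. along axis=1). For example, if I know that the point (3, 7) is a NaN, I want to select ar[3, 8:8+N]. I then want to repeat this for every NaN position and vstack all the resulting slices.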

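I can happily find the NaNs with np.where and loop over the positions with a for loop. Sadly, this is a bit slow. Is there an efficient way to do this indexing in a vectorized manner? So I have a list of tuples (x, y), and I would like to get, more or less,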

result = np.vstack([ar[x, y+1:y+1+N] for x, y in tuples])

without the loop. Is this possible?

Thanks a lot in advance.

Answer 1

What you are asking for is ill-defined when a NaN occurs fewer than N columns from the right edge, but the following should work:

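import numpy as np

# Locate every NaN
rows, cols = np.where(np.isnan(ar))
# For each NaN, build the N column indices to its right
cols = (cols[:, None] + np.arange(1, N+1)).reshape(-1)
# Handle indices out of range by repeating the last column
cols = np.clip(cols, 0, ar.shape[1] - 1)
# Repeat each row index N times to pair with the flattened columns
rows = np.repeat(rows, N)
result = ar[rows, cols].reshape(-1, N)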
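
To make the index construction easier to follow, here is a small sketch of the broadcasting step on its own. The NaN positions and the array width of 6 are made-up values for illustration only, not taken from the question.

```python
import numpy as np

N = 3
rows = np.array([0, 2])          # hypothetical NaN row positions
cols = np.array([1, 4])          # hypothetical NaN column positions
width = 6                        # assumed number of columns in the array

# (2, 1) + (3,) broadcasts to (2, 3): one row of neighbor columns per NaN
neighbor_cols = cols[:, None] + np.arange(1, N + 1)
# Columns past the right edge are pulled back to the last valid column
clipped_cols = np.clip(neighbor_cols, 0, width - 1)
# Each row index is repeated N times so it lines up with the flattened columns
repeated_rows = np.repeat(rows, N)

print(neighbor_cols)
print(clipped_cols)
print(repeated_rows)
```

The second NaN sits too close to the edge, so all of its neighbor columns collapse onto the last column after clipping.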

Making up some fake data:

>>> ar = np.random.rand(25)
>>> ar[np.random.randint(25, size=5)] = np.nan
>>> ar = ar.reshape(5, 5)
>>> N = 2

and running the snippet at the top of this answer on it produces:

>>> ar
array([[ 0.96556647,         nan,  0.02934316,  0.82174232,  0.29293098],
       [ 0.34819313,  0.57449136,         nan,         nan,  0.32791866],
       [ 0.14020414,  0.60668458,  0.95613773,  0.09533064,  0.43401037],
       [ 0.83888255,  0.34240687,         nan,  0.02495232,  0.36234979],
       [ 0.21870906,  0.24181006,  0.81447603,  0.24216213,         nan]])
>>> result
array([[ 0.02934316,  0.82174232],
       [        nan,  0.32791866],
       [ 0.32791866,  0.32791866],
       [ 0.02495232,  0.36234979],
       [        nan,         nan]])
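
If repeating the last column is not the behavior you want near the edge, one possible variant is to fill the out-of-range positions with a sentinel instead. The sketch below is only one way to do that; the function name right_neighbors and its fill parameter are hypothetical and not part of NumPy.

```python
import numpy as np

def right_neighbors(ar, N, fill=np.nan):
    """Return the N values to the right of each NaN in `ar`, one row per NaN.

    Positions past the last column are set to `fill` instead of
    repeating the last column. (Hypothetical helper, for illustration.)
    """
    rows, cols = np.where(np.isnan(ar))
    neighbor_cols = cols[:, None] + np.arange(1, N + 1)   # shape (n_nan, N)
    in_range = neighbor_cols < ar.shape[1]
    # Clamp so the fancy index is always valid, then overwrite the clamped slots
    safe_cols = np.minimum(neighbor_cols, ar.shape[1] - 1)
    out = ar[rows[:, None], safe_cols]                    # fancy indexing returns a copy
    out[~in_range] = fill
    return out

ar = np.array([[1.0, np.nan, 2.0, 3.0, 4.0],
               [5.0, 6.0, 7.0, np.nan, 8.0]])
result = right_neighbors(ar, 2, fill=-1.0)
print(result)
```

With the default fill of NaN, a filled slot cannot be told apart from a genuine NaN neighbor, so a distinct sentinel such as -1.0 may be preferable when that distinction matters.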

Selecting values in an ndarray after a NaN
