Question

Given these two arrays:

import numpy as np

arr1 = np.array([[0,0,0,0,1,0],
                 [0,0,0,0,0,0],
                 [0,0,0,0,0,0],
                 [0,1,0,0,1,0],
                 [0,0,0,0,0,0]], dtype=bool)

arr2 = np.array([[0,1,1,0,0,0],
                 [0,1,1,0,0,0],
                 [0,0,0,0,1,1],
                 [1,1,0,0,1,1],
                 [1,1,0,0,0,0]], dtype=bool)

I need a logical operation that returns as True every connected feature in arr2 that is intersected by arr1. The result should look like this:

arr3 = np.array([[0,0,0,0,0,0],
                 [0,0,0,0,0,0],
                 [0,0,0,0,1,1],
                 [1,1,0,0,1,1],
                 [1,1,0,0,0,0]], dtype=bool)

I looked through the logical operations in Python and the NumPy logic functions, but none of them seems to do this. Any ideas? Thanks :)

Answer 1

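Approach 1

We can use image-processing-based labeling to label the image by connectivity, and then use masking to get the desired output. For the labeling, we can use skimage.measure.label. Alternatively, scipy.ndimage.label also gives us the labeled image. So, one solution would be -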

from skimage.measure import label as sklabel

def func1(arr1,arr2):
    # Get labeled image
    L = sklabel(arr2)

    # Get all present labels (this would also include zero label)
    present_labels = L[arr1]

    # Get presence of these labels in the labeled image. Remove the zero regions
    # by AND-ing with arr2.
    return np.isin(L,present_labels) & arr2

Output for the given sample -

In [141]: func1(arr1,arr2)
Out[141]: 
array([[False, False, False, False, False, False],
       [False, False, False, False, False, False],
       [False, False, False, False,  True,  True],
       [ True,  True, False, False,  True,  True],
       [ True,  True, False, False, False, False]])
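
The text above mentions scipy.ndimage.label as an alternative labeler. Below is a minimal sketch of that variant; it is not part of the original answer, and the function name intersecting_blobs and its diagonal flag are made up for illustration. One detail to watch: for 2D input, skimage.measure.label defaults to full (8-neighbor) connectivity, while scipy.ndimage.label defaults to 4-neighbor connectivity, so the sketch passes an explicit 3x3 structure to match.

```python
import numpy as np
from scipy import ndimage

arr1 = np.array([[0,0,0,0,1,0],
                 [0,0,0,0,0,0],
                 [0,0,0,0,0,0],
                 [0,1,0,0,1,0],
                 [0,0,0,0,0,0]], dtype=bool)

arr2 = np.array([[0,1,1,0,0,0],
                 [0,1,1,0,0,0],
                 [0,0,0,0,1,1],
                 [1,1,0,0,1,1],
                 [1,1,0,0,0,0]], dtype=bool)

def intersecting_blobs(seed, blobs, diagonal=True):
    # Hypothetical helper name, used only for this illustration.
    # A 3x3 all-ones structure treats diagonal pixels as neighbors
    # (8-connectivity); structure=None keeps SciPy's 4-connectivity default.
    structure = np.ones((3, 3), dtype=int) if diagonal else None
    labeled, n_labels = ndimage.label(blobs, structure=structure)

    # Labels found under the True pixels of seed; 0 is background, so drop it.
    hit = np.unique(labeled[seed])
    hit = hit[hit != 0]

    # Keep every pixel whose label was hit.
    return np.isin(labeled, hit)

result = intersecting_blobs(arr1, arr2)
print(result.astype(int))
```

On the sample data the blobs are separated by more than one pixel, so both connectivity settings should give the same mask; they can differ when blobs touch only at corners.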

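Approach 2

For large blobs, we should get the unique present labels before using np.isin, to improve performance. To get those unique values efficiently, we can use pandas.factorize. The labeling part can also be made faster by using the SciPy version. So, a more efficient solution for such cases would be -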

from scipy.ndimage import label as splabel
import pandas as pd

def func2(arr1,arr2):
    L = splabel(arr2)[0]
    pL = pd.factorize(L[arr1])[1]
    return np.isin(L,pL[pL!=0])
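
As a side note, the membership test can also be written without pandas or np.isin by using a boolean lookup table indexed by label. This is a swapped-in technique, not the answer's method, and the name func_lut is hypothetical. It has not been benchmarked here, so treat any speed benefit as an assumption to verify on your own data.

```python
import numpy as np
from scipy.ndimage import label as splabel

arr1 = np.array([[0,0,0,0,1,0],
                 [0,0,0,0,0,0],
                 [0,0,0,0,0,0],
                 [0,1,0,0,1,0],
                 [0,0,0,0,0,0]], dtype=bool)

arr2 = np.array([[0,1,1,0,0,0],
                 [0,1,1,0,0,0],
                 [0,0,0,0,1,1],
                 [1,1,0,0,1,1],
                 [1,1,0,0,0,0]], dtype=bool)

def func_lut(arr1, arr2):
    # Hypothetical variant of func2 for illustration only.
    # Labels run from 1 to n_labels; 0 is the background.
    L, n_labels = splabel(arr2)

    # keep[k] is True if label k is touched by at least one True pixel of arr1.
    keep = np.zeros(n_labels + 1, dtype=bool)
    keep[L[arr1]] = True
    keep[0] = False  # never keep the background

    # Map every pixel's label through the table.
    return keep[L]

out = func_lut(arr1, arr2)
print(out.astype(int))
```

Like func2, this uses SciPy's default 4-neighbor connectivity for the labeling step.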

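Benchmarking

We will use the given sample data, scaled up by 1000x along both rows and columns while keeping the same number of blobs. To scale up, we will use np.repeat -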

In [147]: arr1 = arr1.repeat(1000, axis=0).repeat(1000, axis=1)

In [148]: arr2 = arr2.repeat(1000, axis=0).repeat(1000, axis=1)

In [149]: %timeit func1(arr1,arr2)
     ...: %timeit func2(arr1,arr2)
1.75 s ± 7.01 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
226 ms ± 5.99 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Answer 2

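If you would rather not use any library other than NumPy, I suggest flood-filling the matrix, which can be done with DFS or BFS. The main idea is to iterate over every element of arr1 and, if the element is true, add its position to a stack/queue. Then, from each of those positions, you start a fill over arr2, and for every new position reached you write 1 into arr3. The code would look like this: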

from collections import deque
import numpy as np

# arr3 starts empty, with the same shape as arr2
rows, cols = arr2.shape
arr3 = np.zeros((rows, cols), dtype=bool)
# here I use a deque as a queue
q = deque()
for i in range(rows):
    for j in range(cols):
        if arr1[i][j]:
            # every true element in arr1 is added to the queue
            q.append((i, j))
# BFS traversal
while q:
    i, j = q.popleft()
    if not arr3[i][j] and arr2[i][j]:
        # check arr3 to avoid infinite looping
        # check arr2 to see if the position is part of a component
        arr3[i][j] = True
        # here I assume diagonals are neighbors too,
        # but you can change these loops to define neighbors differently
        for ii in range(i-1, i+2):
            for jj in range(j-1, j+2):
                if ii == i and jj == j:
                    # we don't want to check the same element again
                    continue
                if ii < 0 or ii >= rows or jj < 0 or jj >= cols:
                    # we don't want to access an invalid position either
                    continue
                q.append((ii, jj))
print(arr3)
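
The same flood-fill idea can be wrapped in a reusable function. The following is a sketch under stated assumptions rather than the answerer's code: the name flood_intersect and the diagonal flag are invented for illustration, and cells are marked when they are enqueued so that each cell enters the queue at most once.

```python
from collections import deque
import numpy as np

arr1 = np.array([[0,0,0,0,1,0],
                 [0,0,0,0,0,0],
                 [0,0,0,0,0,0],
                 [0,1,0,0,1,0],
                 [0,0,0,0,0,0]], dtype=bool)

arr2 = np.array([[0,1,1,0,0,0],
                 [0,1,1,0,0,0],
                 [0,0,0,0,1,1],
                 [1,1,0,0,1,1],
                 [1,1,0,0,0,0]], dtype=bool)

def flood_intersect(seed, blobs, diagonal=True):
    # Hypothetical helper name, used only for this illustration.
    rows, cols = blobs.shape
    if diagonal:
        # 8 neighbors: every offset in the 3x3 window except the center
        offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                   if (di, dj) != (0, 0)]
    else:
        # 4 neighbors: up, down, left, right
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    # Only seeds that actually lie on a blob can start a fill.
    start = seed & blobs
    out = start.copy()
    queue = deque(map(tuple, np.argwhere(start)))

    # BFS: spread from each queued cell to unvisited blob neighbors.
    while queue:
        i, j = queue.popleft()
        for di, dj in offsets:
            ii, jj = i + di, j + dj
            if 0 <= ii < rows and 0 <= jj < cols and blobs[ii, jj] and not out[ii, jj]:
                out[ii, jj] = True
                queue.append((ii, jj))
    return out

filled = flood_intersect(arr1, arr2)
print(filled.astype(int))
```

Because this runs a Python-level loop over pixels, it is best suited to small arrays; for large images the labeling-based approaches in Answer 1 are likely the better fit.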

Boolean intersection based on connected neighborhoods - NumPy / Python
