Question

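I have a numpy ndarray made up of zeros, ones and NaNs. I would like to apply a majority filter to this array, meaning I want to set a kernel window (e.g. 3x3 cells) that moves over the array and changes the value of the center cell to the value that occurs most often among its neighbors. The filter should obey two constraints: it should ignore NaNs, and if the center cell's value is 1 it should stay 1.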

Here is a small example of what I'm looking for. Input array:

array([[ 1.,  1.,  1.,  0.,  0.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])

Output array after applying the majority filter:

array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  1.,  1.]])

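I was looking at the scipy filters but couldn't find anything adequate. I thought about building a generic convolved filter, but I'm not sure how to do that for a majority purpose. It feels like a basic filter that should already exist, but I can't seem to find it.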

Answer 1

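Here's a vectorized idea based on convolution. Given those constraints, it seems we only need to edit the places that hold 0s. For each sliding window, get the count of 1s and then the count of non-NaNs; the latter sets the threshold for deciding whether the 1s are in the majority. If they are, set those places that are also 0s to 1s.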

The implementation would look something like this -

import numpy as np
from scipy.signal import convolve2d

def fill0s(a):
    # Mask of NaNs
    nan_mask = np.isnan(a)

    # Convolution kernel
    k = np.ones((3,3),dtype=int)

    # Get count of 1s for each kernel window
    ones_count = convolve2d(np.where(nan_mask,0,a),k,'same')

    # Get count of elements per window and hence non NaNs count
    n_elem = convolve2d(np.ones(a.shape,dtype=int),k,'same')
    nonNaNs_count = n_elem - convolve2d(nan_mask,k,'same')

    # Compare 1s count against half of nonNaNs_count for the first mask.
    # This tells us if 1s are majority among non-NaNs population.
    # Second mask would be of 0s in a. Use Combined mask to set 1s.
    final_mask = (ones_count >= nonNaNs_count/2.0) & (a==0)
    return np.where(final_mask,1,a)

Note that since we are performing uniform filtering with this kernel of 1s, we could also use uniform_filter.
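
As a rough sketch of that uniform_filter idea (not part of the original answer): scipy.ndimage.uniform_filter returns window means rather than sums, so the version below multiplies by the window area and rounds to recover integer counts. The name fill0s_uniform and the size parameter are made up for illustration, and mode='constant' with cval=0.0 is assumed to mimic the zero padding of convolve2d with 'same'.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fill0s_uniform(a, size=3):
    # Mask of NaNs
    nan_mask = np.isnan(a)
    area = size * size

    # Window mean times window area gives the window sum; rounding
    # removes floating-point noise so the counts are exact integers.
    ones_count = np.rint(uniform_filter(
        np.where(nan_mask, 0.0, a), size=size, mode='constant', cval=0.0) * area)

    # Count of in-bounds, non-NaN cells per window (padding contributes 0)
    valid_count = np.rint(uniform_filter(
        (~nan_mask).astype(float), size=size, mode='constant', cval=0.0) * area)

    # Same rule as fill0s: 1s win when they reach half of the valid cells
    final_mask = (ones_count >= valid_count / 2.0) & (a == 0)
    return np.where(final_mask, 1.0, a)
```

The inputs are cast to float on purpose: with an integer input, uniform_filter would write the means into an integer array and truncate them.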

Sample run -

In [232]: a
Out[232]: 
array([[ 1.,  1.,  1.,  0.,  0.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])

In [233]: fill0s(a)
Out[233]: 
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  1.,  1.]])
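
To see the threshold rule at work on a single window, here is a small hand check for the cell at row 3, column 3 (0-based) of the sample array. It is only an illustration; slicing is used in place of the zero-padded convolution, which gives the same counts because padded cells are never counted as valid.

```python
import numpy as np

a = np.array([[1., 1., 1., 0., 0.],
              [1., 1., np.nan, 1., 1.],
              [np.nan, 1., 1., 0., 1.],
              [0., 0., 0., 0., 1.]])

# 3x3 window centered on (3, 3), clipped at the bottom edge of the array
window = a[2:4, 2:5]

# Count of 1s and count of non-NaN cells in the window (center included)
ones_count = int(np.nansum(window))
valid_count = int(np.sum(~np.isnan(window)))

# The 0 at the center becomes 1 because the comparison uses >=,
# so an exact tie between 0s and 1s is resolved in favor of 1
becomes_one = ones_count >= valid_count / 2.0
print(ones_count, valid_count, becomes_one)
```

The window holds three 1s out of six valid cells, so the tie goes to 1, which matches the 1 at that position in the output above.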

Answer 2

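Try the following code:

Note that because of how numpy.argmax behaves when several indices share the same maximum, the result differs slightly from yours (you may want to write your own argmax function... x = np.argwhere(x == np.max(x))[:, 0] gives all the indices rather than only the first)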

import numpy as np

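def block_fn(x,center_val):

    # NaN centers stay NaN and 1 centers stay 1
    if np.isnan(center_val):
        return np.nan
    elif center_val == 1:
        return 1.0
    else:
        # Drop NaNs so they are ignored in the count (the block includes the center)
        vals = x[~np.isnan(x)]
        unique_elements, counts_elements = np.unique(vals, return_counts=True)
        # argmax returns the first maximum, so a tie resolves to the smaller value (0)
        return unique_elements[np.argmax(counts_elements)]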



def majority_filter(x,block_size = (3,3)):

    #Odd block sizes only  ( ? )
    assert(block_size[0]%2 != 0 and block_size[1]%2 !=0)

    yy =int((block_size[0]-1)/2)
    xx =int((block_size[1]-1)/2)


    output= np.zeros_like(x)
    for i in range(0,x.shape[0]):
        miny,maxy = max(0,i-yy),min(x.shape[0]-1,i+yy)

        for j in range(0,x.shape[1]):
            minx,maxx = max(0,j-xx),min(x.shape[1]-1,j+xx)

            #Extract block to take majority filter over
            block=x[miny:maxy+1,minx:maxx+1]

            output[i,j] = block_fn(block,center_val=x[i,j])


    return output


inp=np.array([[ 1.,  1.,  1.,  0.,  0.],
       [ 1.,  1., np.nan,  1.,  1.],
       [np.nan,  1.,  1.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])


print(majority_filter(inp))
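
The note above about numpy.argmax and ties can be sketched as follows. This is one possible tie-aware replacement for the block function, not the answerer's code; the name majority_value and the tie_value parameter are hypothetical. It uses the np.argwhere expression from the note to collect every value that reaches the maximum count, and prefers tie_value when it is among them.

```python
import numpy as np

def majority_value(block, center_val, tie_value=1.0):
    # NaN centers stay NaN and 1 centers stay 1
    if np.isnan(center_val):
        return np.nan
    if center_val == 1:
        return 1.0

    # Count only the non-NaN values in the block (center included)
    vals = block[~np.isnan(block)]
    uniq, counts = np.unique(vals, return_counts=True)

    # All values sharing the maximum count, not just the first one
    winners = uniq[np.argwhere(counts == np.max(counts))[:, 0]]

    # On a tie, prefer tie_value; otherwise return the single winner
    return tie_value if tie_value in winners else float(winners[0])
```

With tie_value=1.0, a window holding as many 1s as 0s turns the center into 1, which is the behavior the expected output in the question implies.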

Majority filter on a NumPy array
