Question

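Suppose I have a numpy array of shape (1, 4, 5):

arr = np.array([[[ 0,  0,  0,  3,  0],
                 [ 0,  0,  2,  3,  2],
                 [ 0,  0,  0,  0,  0],
                 [ 2,  1,  0,  0,  0]]])

I want to find the most frequent non-zero value along a given axis, and get zero back only when there are no non-zero values.

For example, looking at axis = 2, I would like to get something like [[3, 2, 0, 2]] from this array (for the last row, either 1 or 2 would be fine). Is there a good way to do this?

I have already tried the solution from the following question (Link), but I am not sure how to modify it to exclude a specific value. Thanks again!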

Answer 1

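We can solve this with numpy.apply_along_axis and a simple function. Here we use numpy.bincount to count how often each value occurs, and then numpy.argmax to pick the value with the highest count. If there are no values other than exclude, we return exclude.

Code: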

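import numpy as np

def get_freq(array, exclude):
    # Count occurrences of every value except the excluded one
    count = np.bincount(array[array != exclude])
    if count.size == 0:
        # Nothing left after filtering: fall back to the excluded value
        return exclude
    else:
        # The index of the largest count is the most frequent value
        return np.argmax(count)

np.apply_along_axis(lambda x: get_freq(x, 0), axis=2, arr=arr)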

Output:

array([[3, 2, 0, 1]])

Note that if you pass an empty array, it also returns exclude.
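To make the counting step concrete, here is a minimal sketch on a single row. The variable names (row, nonzero, counts, empty) are chosen for illustration only and are not part of the answer.

```python
import numpy as np

# One row from the example array
row = np.array([0, 0, 2, 3, 2])

# Drop the excluded value (0) before counting
nonzero = row[row != 0]

# counts[v] is the number of times the value v occurs
counts = np.bincount(nonzero)
print(counts)             # [0 0 2 1]
print(np.argmax(counts))  # 2

# With nothing left to count, bincount returns an empty array,
# which is the case the size check in get_freq guards against
empty = np.bincount(np.array([], dtype=int))
print(empty.size)         # 0
```

Because np.argmax returns the first index of the largest count, ties go to the smallest value, which is why the last row of the example yields 1 rather than 2.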

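Edit: As Ehsan pointed out, the solution above does not work when the array contains negative values, because numpy.bincount only accepts non-negative integers. In that case, use Counter from collections: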

arr = np.array([[[ 0,  -3,  0,  3,  0],
                 [ 0,  0,  2,  3,  2],
                 [ 0,  0,  0,  0,  0],
                 [ 2,  -5,  0,  -5, 0]]])

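from collections import Counter

def get_freq(array, exclude):
    # most_common(1) gives a list holding the single (value, count) pair
    count = Counter(array[array != exclude]).most_common(1)
    if not count:
        # Empty list: the input was empty or held only the excluded value
        return exclude
    else:
        return count[0][0]

np.apply_along_axis(lambda x: get_freq(x, 0), axis=2, arr=arr)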

Output:

array([[-3,  2,  0, -5]])

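most_common(1) returns the most frequent value in the Counter object as a one-element list containing a tuple, whose first element is the value and whose second element is its count. Because the result is a list, it has to be indexed twice. If the list is empty, most_common found no elements (the input was empty or contained only the excluded value).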
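The return shape described above can be checked directly. This is a small sketch on a plain Python list; the names values and top are illustrative only.

```python
from collections import Counter

# Non-zero values from the last row of the second example
values = [2, -5, -5]

top = Counter(values).most_common(1)
print(top)        # [(-5, 2)]
print(top[0][0])  # -5

# An empty input yields an empty list, which is falsy
print(Counter([]).most_common(1))  # []
```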

Answer 2

Here is another solution (probably not as efficient as the one above, but a different approach):


# One-hot encode the values (this assumes integers in the range 0..4),
# drop the column for 0, and count each value 1..4 along axis=2
counts = np.sum(np.eye(5)[arr][:, :, :, 1:], axis=2)

# Most frequent non-zero value per row (argmax index + 1),
# or 0 where a row contains no non-zero values at all
result = np.where(counts.max(axis=2) > 0, np.argmax(counts, axis=2) + 1, 0)

print(result)
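The one-hot indexing above may be easier to follow on a single row. The sketch below assumes, as np.eye(5) does, that every value is an integer between 0 and 4; the names row, one_hot and counts are illustrative.

```python
import numpy as np

row = np.array([0, 0, 2, 3, 2])

# Indexing the identity matrix with the values gives one one-hot row per element
one_hot = np.eye(5, dtype=int)[row]   # shape (5, 5)

# Summing over the elements counts how often each value 0..4 occurs
counts = one_hot.sum(axis=0)
print(counts)                         # [2 0 2 1 0]

# Skip the count for 0, then shift the argmax index back to a value
print(np.argmax(counts[1:]) + 1)      # 2
```

This approach uses memory proportional to the number of elements times the number of distinct values, and it cannot handle negative values.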

numpy: How to find the most frequent non-zero value in an array?
