How to get the length of repeated numbers column by column?

Time: 2016-09-01 07:24:58

Tags: python numpy

I want to get the lengths of repeated numbers in Python with NumPy. For example, consider a simple ndarray:

import numpy as np

a = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
])

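The first column is [0, 1, 0, 1]. Its first 1 is at position 1; counting from there, we get ones = 2 and zeros = 1. In other words, zeros are counted only after the first 1 (the starting position) has been encountered.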

So the answer for a is

ones = [2, 2, 1, 1, 1, 3, 2, 2, 1, 1]
zeros = [1, 0, 2, 1, 0, 0, 1, 1, 1, 2]

Can anyone help me?

Update

3D array:

a = np.array([
    [
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 1, 1, 1, 0, 0],
        [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
        [1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
    ],
    [
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 1, 0, 0, 0, 1, 1],
        [0, 1, 0, 1, 0, 0, 0, 1, 0, 0],
        [1, 1, 0, 1, 0, 1, 1, 1, 0, 0],
    ]
])

The expected output should be

ones = [
         [2, 3, 0, 0, 1, 3, 2, 2, 1, 0],
         [1, 3, 0, 2, 1, 1, 1, 2, 1, 1]
       ]
zeros = [
          [1, 0, 0, 0, 0, 0, 1, 1, 1, 0],
          [0, 0, 0, 0, 2, 0, 0, 0, 2, 2]
        ]

1 answer:

Answer 0: (score: 4)

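With performance in mind, here is a generic approach that works for ndarrays of any number of dimensions, assuming the array contains only 0s and 1s: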

ones_count = a.sum(-2)
zeros_count = (a.shape[-2] - ones_count - a.argmax(-2))*a.any(-2)
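To see why the formula works, it may help to walk through it for a single column. The following is only a sketch: counts_for_column is a hypothetical helper name, not part of the answer, and it assumes the column holds only 0s and 1s. argmax gives the index of the first 1, so the length of the column minus that index is the length of the tail starting at the first 1; subtracting the number of 1s leaves the zeros in that tail. Because argmax also returns 0 for an all-zero column, the result is masked with any().

```python
import numpy as np

def counts_for_column(col):
    # Hypothetical helper name; `col` is assumed to hold only 0s and 1s.
    col = np.asarray(col)
    n = col.shape[0]
    ones_count = int(col.sum())      # every 1 lies at or after the first 1
    first_one = int(col.argmax())    # index of the first 1 (0 if there is none)
    tail_len = n - first_one         # elements from the first 1 to the end
    has_one = bool(col.any())        # False for an all-zero column
    zeros_count = (tail_len - ones_count) * has_one
    return ones_count, zeros_count

first_col = counts_for_column([0, 1, 0, 1])  # first column of the 2D example
empty_col = counts_for_column([0, 0, 0, 0])  # column with no 1 at all
```

Without the any() mask, the all-zero column would report 4 zeros instead of 0, since argmax cannot distinguish "first 1 at index 0" from "no 1 at all".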

Another way to get zeros_count, using np.where to do the selection, is:

zeros_count = np.where(a.any(-2),a.shape[-2] - ones_count - a.argmax(-2),0)
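If the counting axis is not always the second-to-last one, the np.where variant could be wrapped in a small function. This is a sketch under assumptions: the name count_ones_and_trailing_zeros and the axis parameter are illustrative additions, not something the answer defines, and the input is assumed to contain only 0s and 1s.

```python
import numpy as np

def count_ones_and_trailing_zeros(a, axis=-2):
    """Hypothetical helper: per-slice counts along `axis` for a 0/1 array."""
    a = np.asarray(a)
    n = a.shape[axis]
    ones_count = a.sum(axis=axis)
    first_one = a.argmax(axis=axis)  # 0 where the slice has no 1 at all
    # Keep the computed value only where at least one 1 exists.
    zeros_count = np.where(a.any(axis=axis), n - ones_count - first_one, 0)
    return ones_count, zeros_count

a = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
])
ones_count, zeros_count = count_ones_and_trailing_zeros(a)
```

With the default axis=-2 this reproduces the column-wise counts from the question; passing axis=-1 would count along rows instead.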

Sample run

2D case:

In [60]: a
Out[60]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
       [0, 1, 0, 1, 0, 1, 0, 0, 1, 0],
       [1, 1, 0, 0, 1, 1, 1, 1, 0, 0]])

In [61]: ones_count = a.sum(-2)
    ...: zeros_count = (a.shape[-2] - ones_count - a.argmax(-2))*a.any(-2)
    ...: 

In [62]: ones_count
Out[62]: array([2, 2, 1, 1, 1, 3, 2, 2, 1, 1])

In [63]: zeros_count
Out[63]: array([1, 0, 2, 1, 0, 0, 1, 1, 1, 2])

3D case:

In [65]: a = np.array([
    ...:     [
    ...:         [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    ...:         [1, 1, 0, 0, 0, 1, 1, 1, 0, 0],
    ...:         [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
    ...:         [1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
    ...:     ],
    ...:     [
    ...:         [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    ...:         [0, 1, 0, 0, 1, 0, 0, 0, 1, 1],
    ...:         [0, 1, 0, 1, 0, 0, 0, 1, 0, 0],
    ...:         [1, 1, 0, 1, 0, 1, 1, 1, 0, 0],
    ...:     ]
    ...: ])

In [66]: ones_count = a.sum(-2)
    ...: zeros_count = (a.shape[-2] - ones_count - a.argmax(-2))*a.any(-2)
    ...: 

In [67]: ones_count
Out[67]: 
array([[2, 3, 0, 0, 1, 3, 2, 2, 1, 0],
       [1, 3, 0, 2, 1, 1, 1, 2, 1, 1]])

In [68]: zeros_count
Out[68]: 
array([[1, 0, 0, 0, 0, 0, 1, 1, 1, 0],
       [0, 0, 0, 0, 2, 0, 0, 0, 2, 2]])

And so on, for arrays with more dimensions.
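As a rough check of the higher-dimensional claim, the vectorized formula can be compared against a plain Python loop on a random 4D array. This is only a sketch: brute_force is a hypothetical reference implementation of the counting rule described in the question, and the array shape and random seed are arbitrary choices.

```python
import numpy as np

def brute_force(col):
    """Hypothetical reference loop for one 1D column of 0s and 1s."""
    ones = zeros = 0
    started = False
    for v in col:
        if v == 1:
            started = True   # counting begins at the first 1
            ones += 1
        elif started:
            zeros += 1       # zeros only count after the first 1
    return ones, zeros

rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=(2, 3, 4, 5))  # 4D array of 0s and 1s

# Same formula as in the answer, applied along the second-to-last axis.
ones_count = a.sum(-2)
zeros_count = (a.shape[-2] - ones_count - a.argmax(-2)) * a.any(-2)

# Compare with the loop, one column at a time.
mismatches = 0
for i, j, k in np.ndindex(ones_count.shape):
    if brute_force(a[i, j, :, k]) != (ones_count[i, j, k], zeros_count[i, j, k]):
        mismatches += 1
print(mismatches)  # expected to be 0 if the two methods agree
```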