Counting runs of identical elements in an array and creating a dictionary

Time: 2013-05-02 15:13:43

Tags: python numpy count

This question may be too basic, but I still can't figure out how to do it properly.

I have a given array [0,0,0,0,0,0,1,1,2,1,0,0,0,0,1,0,1,2,1,0,2,3] (arbitrary elements from 0-5), and I want to count how often runs of consecutive zeros of each length occur.

1 run  of 6 zeros in a row
1 run  of 4 zeros in a row
2 runs of 1 zero  in a row

=> (2,0,0,1,0,1)

So the dictionary uses the length n of a run of zeros as the key and the number of such runs as the value.

The final array contains more than 5 million values which, like the example above, are unsorted.

3 answers:

Answer 0 (score: 2)

This should get you what you want:

import numpy as np

a = [0,0,0,0,0,0,1,1,2,1,0,0,0,0,1,0,1,2,1,0,2,3]

# Find indexes of all zeroes
index_zeroes = np.where(np.array(a) == 0)[0]

# Find discontinuities in indexes, denoting separated groups of zeroes
# Note: Adding True at the end because otherwise the last zero is ignored
index_zeroes_disc = np.where(np.hstack((np.diff(index_zeroes) != 1, True)))[0]

# Count the number of zeroes in each group
# Note: Adding 0 at the start so first group of zeroes is counted
count_zeroes = np.diff(np.hstack((0, index_zeroes_disc + 1)))

# Count the number of groups with the same number of zeroes
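groups_of_n_zeroes = {}
for count in count_zeroes:
    count = int(count)  # use plain int keys rather than NumPy scalars
    # dict.has_key() was removed in Python 3; the `in` operator works everywhere
    if count in groups_of_n_zeroes:
        groups_of_n_zeroes[count] += 1
    else:
        groups_of_n_zeroes[count] = 1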

groups_of_n_zeroes holds:

{1: 2, 4: 1, 6: 1}
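
As a possible variation on the same idea, the sketch below pads the zero mask with False on both sides, so runs touching either end of the array need no special handling, and lets np.unique do the final counting. The helper name zero_run_histogram is made up for illustration, and return_counts assumes a reasonably recent NumPy (1.9 or later).

```python
import numpy as np

def zero_run_histogram(values):
    """Map run length -> number of runs of consecutive zeros (hypothetical helper)."""
    is_zero = np.asarray(values) == 0
    # Pad with False on both sides so runs at either end are closed off
    padded = np.concatenate(([False], is_zero, [False]))
    # +1 marks the start of a run, -1 marks the position just past its end
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    run_lengths = ends - starts
    # Count how many runs share each length
    lengths, counts = np.unique(run_lengths, return_counts=True)
    return {int(n): int(c) for n, c in zip(lengths, counts)}

a = [0,0,0,0,0,0,1,1,2,1,0,0,0,0,1,0,1,2,1,0,2,3]
print(zero_run_histogram(a))  # {1: 2, 4: 1, 6: 1}
```

With this layout an input without any zeros simply yields an empty dictionary.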

Answer 1 (score: 1)

Similar to @fgb's answer, but with a neater way of counting the occurrences:

import numpy as np

items = np.array([0,0,0,0,0,0,1,1,2,1,0,0,0,0,1,0,1,2,1,0,2,3])
group_end_idx = np.concatenate(([-1],
                                np.nonzero(np.diff(items == 0))[0],
                                [len(items)-1]))
group_len = np.diff(group_end_idx)
zero_lens = group_len[::2] if items[0] == 0 else group_len[1::2]
counts = np.bincount(zero_lens)

>>> counts[1:]
array([2, 0, 0, 1, 0, 1], dtype=int64)
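
Since the question asks for a dictionary, the bincount result can be converted by keeping only the run lengths that actually occur. A minimal sketch, starting from the run lengths [6, 4, 1, 1] of the example; the name counts_by_length is only illustrative.

```python
import numpy as np

# np.bincount puts the number of runs of length n at index n
counts = np.bincount(np.array([6, 4, 1, 1]))

# Keep only the run lengths with a non-zero count
lengths = np.flatnonzero(counts)
counts_by_length = {int(n): int(counts[n]) for n in lengths}
print(counts_by_length)  # {1: 2, 4: 1, 6: 1}
```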

Answer 2 (score: 0)

This looks awfully complicated, but I can't seem to find anything better:

>>> l = [0, 0, 0, 0, 0, 0, 1, 1, 2, 1, 0, 0, 0, 0, 1, 0, 1, 2, 1, 0, 2, 3]

>>> import itertools
>>> seq = [len(list(j)) for i, j in itertools.groupby(l) if i == 0]
>>> seq
[6, 4, 1, 1]

>>> import collections
>>> counter = collections.Counter(seq)
>>> [counter.get(i, 0) for i in range(1, max(counter) + 1)]
[2, 0, 0, 1, 0, 1]
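
The same groupby idea can be sketched without building a temporary list per group, which may matter for an input of several million values, and the Counter itself already serves as the requested dictionary. The function name zero_run_counter is hypothetical, and max(..., default=0) assumes Python 3.4 or later.

```python
from collections import Counter
from itertools import groupby

def zero_run_counter(values):
    """Count runs of consecutive zeros by length (hypothetical helper)."""
    # sum(1 for _ in group) measures the run without materialising it as a list
    return Counter(sum(1 for _ in group)
                   for key, group in groupby(values) if key == 0)

l = [0, 0, 0, 0, 0, 0, 1, 1, 2, 1, 0, 0, 0, 0, 1, 0, 1, 2, 1, 0, 2, 3]
counter = zero_run_counter(l)
print(dict(counter))

# Dense tuple form from the question; default=0 covers inputs without zeros
as_tuple = tuple(counter.get(n, 0)
                 for n in range(1, max(counter, default=0) + 1))
print(as_tuple)  # (2, 0, 0, 1, 0, 1)
```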