Question

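OK, I think this should be simple, but my numpy-fu isn't strong enough. I have an array A of integers; it is tiled N times. I want a running count of how many times each element has been used so far.

For example, the following (I've reshaped the array to make the repetition obvious):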

[0, 1, 2, 0, 0, 1, 0] \
[0, 1, 2, 0, 0, 1, 0] ...

would become:

[0, 0, 0, 1, 2, 1, 3] \
[4, 2, 1, 5, 6, 3, 7]
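
To pin down what "running count" means here, this is a brute-force sketch of the definition: each output entry is the number of earlier positions holding the same value. The names `A`, `N`, `tiled` and `expected` are just illustrative, and the pairwise comparison uses O(n^2) memory, so it is only meant as a reference for tiny inputs.

```python
import numpy as np

A = np.array([0, 1, 2, 0, 0, 1, 0])   # one tile, taken from the example above
N = 2                                  # number of repetitions (illustrative choice)
tiled = np.tile(A, N)

# same[i, j] is True when positions i and j hold the same value.
same = tiled[:, None] == tiled[None, :]

# Keep only pairs with j < i (strictly below the diagonal), then count per row.
expected = np.tril(same, k=-1).sum(axis=1)

print(expected.reshape(N, -1))
# [[0 0 0 1 2 1 3]
#  [4 2 1 5 6 3 7]]
```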

The following Python code does this, although it is inelegant and slow:

from collections import defaultdict

def running_counts(ar):
    # counts[v] is how many times the value v has been seen so far
    counts = defaultdict(int)
    def get_count(num):
        c = counts[num]
        counts[num] += 1
        return c
    return [get_count(num) for num in ar]

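I can almost see a silly trick that would do this, but not quite.

Update

OK, I've made improvements, but they still rely on the running_counts method above. The following speeds things up and feels right to me: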

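import numpy as np

def sample_counts(ar, repititions):
    # occurrences of each value in one tile; bincount assumes non-negative ints
    # (np.histogram with max+1 bins only lines up with the values when min is 0)
    tile_bins = np.bincount(ar)
    tile_mult = tile_bins[ar]
    # running counts within the first tile
    first_steps = running_counts(ar)
    tiled = np.tile(tile_mult, repititions).reshape(repititions, -1)
    multiplier = np.reshape(np.arange(repititions), (repititions, 1))
    # row r becomes r * (per-tile total) + (count within the first tile)
    tiled *= multiplier
    tiled += first_steps
    return tiled.ravel()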
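
The idea behind sample_counts can be written as a single broadcast expression: in repetition r, an element has already been seen r times the per-tile total of its value, plus its running count inside the first tile. Below is a self-contained sketch of that formula; `tiled_counts_sketch` is a hypothetical name, and it assumes a non-empty tile of non-negative integers (because of `np.bincount`). It still loops once over a single tile, so it does not remove the Python loop, it only restates the decomposition.

```python
import numpy as np

def tiled_counts_sketch(tile, repetitions):
    """Running counts for `tile` repeated `repetitions` times (illustrative)."""
    tile = np.asarray(tile)
    # Per-tile total of each element's value, aligned with the tile positions.
    totals = np.bincount(tile)[tile]

    # Running count within one tile, computed with a plain Python loop.
    first = np.empty(tile.shape[0], dtype=np.intp)
    seen = {}
    for i, v in enumerate(tile.tolist()):
        first[i] = seen.get(v, 0)
        seen[v] = seen.get(v, 0) + 1

    # Row r of the result is r * totals + first; broadcasting builds all rows.
    reps = np.arange(repetitions)[:, None]
    return (reps * totals + first).ravel()

print(tiled_counts_sketch([0, 1, 2, 0, 0, 1, 0], 2).reshape(2, -1))
```

On the example tile, the second row works out as 1 * [4, 2, 1, 4, 4, 2, 4] + [0, 0, 0, 1, 2, 1, 3] = [4, 2, 1, 5, 6, 3, 7].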

Any elegant ideas for getting rid of running_counts()? Speed is OK now; it just feels a little inelegant.

Answer 1

Here's my take:

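import numpy as np

def countify2(ar):
    ar2 = np.ravel(ar)
    ar3 = np.empty(ar2.shape, dtype=np.int32)
    uniques = np.unique(ar2)
    myarange = np.arange(ar2.shape[0])
    for u in uniques:
        mask = ar2 == u
        # number the occurrences of u as 0, 1, 2, ... in order of appearance;
        # the slice keeps the right-hand side the same length as the mask selection
        ar3[mask] = myarange[:np.count_nonzero(mask)]
    return ar3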

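This method works best when there are many more elements than unique values.

Yes, it's similar to Sven's, but I really did write it well before he posted. I just had to run off somewhere.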
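
For the opposite case, where there are many unique values and the loop over `uniques` gets long, one possible loop-free variant uses a stable sort. This is a different technique from countify2, sketched under the assumption that the input is a flat or flattenable array of sortable values; `running_counts_sorted` is a hypothetical name.

```python
import numpy as np

def running_counts_sorted(ar):
    """Running count of each value, without a Python-level loop (sketch)."""
    ar = np.ravel(np.asarray(ar))
    n = ar.shape[0]
    if n == 0:
        return np.empty(0, dtype=np.intp)

    # A stable sort groups equal values while keeping their original order.
    order = np.argsort(ar, kind="stable")
    sorted_vals = ar[order]

    # Mark the first element of every run of equal values.
    is_start = np.empty(n, dtype=bool)
    is_start[0] = True
    is_start[1:] = sorted_vals[1:] != sorted_vals[:-1]

    # Position within its run = sorted position minus the start of that run.
    run_starts = np.flatnonzero(is_start)
    run_id = np.cumsum(is_start) - 1
    rank_in_run = np.arange(n) - run_starts[run_id]

    # Scatter the ranks back to the original positions.
    out = np.empty(n, dtype=np.intp)
    out[order] = rank_in_run
    return out

print(running_counts_sorted(np.tile([0, 1, 2, 0, 0, 1, 0], 2)).reshape(2, -1))
```

The cost is dominated by the sort, so it scales as O(n log n) regardless of how many distinct values there are, whereas the loop in countify2 does one full pass per unique value.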

How do I get a running count of numpy array values?
