Question

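I have many 2D arrays of size 1161 x 1161 made up of the digits 0, 1, 2 and 3. For example, one of them consists of:

521859 zeros, 288972 ones, 481471 twos, 55619 threes.

I want to find the fastest way to get the same array, but relabeled so that the least frequent value becomes 0, the second least frequent becomes 1, and so on. For the example above, the result would consist of:

55619 zeros, 288972 ones, 481471 twos, 521859 threes.

If there is a very Pythonic way to do this, even better.

Thanks in advance for your help!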

Answer 1

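You can use np.unique to get the unique elements and their counts, then build a dictionary whose keys are the old values and whose values are the new values. Finally, use np.vectorize to apply it to the whole array: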

import numpy as np
from operator import itemgetter

arr = np.array([2, 2, 0, 0, 0, 1, 3, 3, 3, 3])

# get unique elements and counts
counts = zip(*np.unique(arr, return_counts=True))

# create a lookup dictionary value -> i where values are sorted according to frequency
mapping = {value: i for i, (value, _) in enumerate(sorted(counts, key=itemgetter(1)))}

# apply the dictionary in a vectorized way
result = np.vectorize(mapping.get)(arr)

print(result)

Output

[1 1 2 2 2 0 3 3 3 3]
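
Since the question asks for the fastest approach, it is worth noting that np.vectorize is essentially a Python-level loop and is provided for convenience rather than performance. The following is a minimal sketch of an alternative that is not part of the original answer: it assumes the array holds small non-negative integers (as in the question), builds a lookup table, and applies it with integer indexing. The function name relabel_by_frequency is made up for illustration.

```python
import numpy as np

def relabel_by_frequency(arr):
    """Relabel values so the least frequent becomes 0, the next 1, and so on.

    Assumption: arr contains small non-negative integers.
    """
    values, counts = np.unique(arr, return_counts=True)
    # positions of the unique values ordered from least to most frequent;
    # a stable sort breaks ties in favor of the smaller value
    order = np.argsort(counts, kind="stable")
    # lookup[old_value] == new_value
    lookup = np.empty(values.max() + 1, dtype=arr.dtype)
    lookup[values[order]] = np.arange(len(values))
    # integer indexing applies the table to every element and keeps the shape
    return lookup[arr]

arr2d = np.array([[2, 2, 0, 0, 0],
                  [1, 3, 3, 3, 3]])
result2d = relabel_by_frequency(arr2d)
print(result2d)
# [[1 1 2 2 2]
#  [0 3 3 3 3]]
```

This works on 2D input directly, so it should apply unchanged to the 1161 x 1161 arrays; whether it is actually faster on your data is worth timing.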

A possibly cleaner alternative is to use collections.Counter to do the counting and build the mapping dictionary:

from collections import Counter

# get unique elements and counts (for a 2D array, pass arr.ravel() instead of arr)
counts = Counter(arr)

# create a lookup dictionary value -> i where values are sorted according to frequency
mapping = {value: i for i, value in enumerate(sorted(counts, key=counts.get))}
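
As a self-contained sketch of the Counter variant applied to a 2D array like those in the question: Counter needs hashable items, so the array is flattened first (iterating a 2D array yields rows, which are not hashable). The names arr2d and result2d are chosen here for illustration.

```python
from collections import Counter

import numpy as np

arr2d = np.array([[2, 2, 0, 0, 0],
                  [1, 3, 3, 3, 3]])

# flatten to a list of Python ints so each element is hashable
counts = Counter(arr2d.ravel().tolist())

# old value -> new label, ordered from least to most frequent
mapping = {value: i for i, value in enumerate(sorted(counts, key=counts.get))}

# apply the dictionary element by element; the 2D shape is preserved
result2d = np.vectorize(mapping.get)(arr2d)

print(mapping)
# {1: 0, 2: 1, 0: 2, 3: 3}
print(result2d)
# [[1 1 2 2 2]
#  [0 3 3 3 3]]
```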

2D numpy array: replace values according to number of occurrences
