Question

Say I have these two arrays:

import numpy as np

dictionary = np.array(['a', 'b', 'c'])
array = np.array([['a', 'a', 'c'], ['b', 'b', 'c']])

I want to replace each element of array with the index of its value in dictionary. So:

for index, value in enumerate(dictionary):
    array[array == value] = index
array = array.astype(int)

gives:

array([[0, 0, 2],
       [1, 1, 2]])

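Is there a vectorized way to do this? I know that if array already contained the indices and I wanted the strings from dictionary, I could simply do dictionary[array]. But here I effectively need a "lookup" of the strings.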

(I have also seen this answer, but I am wondering whether anything new has become available since 2010.)

Answer 1

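If your dictionary is sorted, and dictionary and array contain the same elements, np.unique does the trick:

uniq, inv = np.unique(array, return_inverse=True)
result = inv.reshape(array.shape)

If some elements are missing from array:

uniq, inv = np.unique(np.r_[dictionary, array.ravel()], return_inverse=True)
result = inv[len(dictionary):].reshape(array.shape)

General case:

uniq, inv = np.unique(np.r_[dictionary, array.ravel()], return_inverse=True)
back = np.empty_like(inv[:len(dictionary)])
back[inv[:len(dictionary)]] = np.arange(len(dictionary))
result = back[inv[len(dictionary):]].reshape(array.shape)

Explanation: in the form used here, np.unique returns the unique elements in sorted order, together with the index into this sorted list of each element of the argument. So, to get indices into the original dictionary, we need to remap the indices. We know that uniq[inv[:len(uniq)]] == dictionary. We therefore have to invert that mapping, i.e. find back such that back[inv[:len(dictionary)]] == np.arange(len(dictionary)), which is what the code does.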
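To make the remapping step concrete, here is a minimal sketch of the general case applied to a dictionary that is deliberately not sorted. The function name lookup_indices and the sample data are assumptions made for illustration. The sketch also assumes that dictionary contains no duplicates and that every element of array occurs in dictionary.

```python
import numpy as np


def lookup_indices(dictionary, array):
    """Return, for each element of array, its index in dictionary.

    Hypothetical helper; assumes dictionary has no duplicates and
    every element of array is present in dictionary.
    """
    n = len(dictionary)
    # uniq: sorted unique values; inv: position in uniq of every input element.
    uniq, inv = np.unique(np.r_[dictionary, array.ravel()], return_inverse=True)
    # inv[:n] maps dictionary position -> sorted position; back inverts that map.
    back = np.empty_like(inv[:n])
    back[inv[:n]] = np.arange(n)
    # inv[n:] holds the sorted positions of the array elements; remap them.
    return back[inv[n:]].reshape(array.shape)


# Sample data (an assumption for illustration): unsorted dictionary.
dictionary = np.array(['c', 'a', 'b'])
array = np.array([['a', 'a', 'c'], ['b', 'b', 'c']])

result = lookup_indices(dictionary, array)
print(result)
# Indexing the dictionary with the result recovers the original strings.
print((dictionary[result] == array).all())
```

Here uniq is ['a', 'b', 'c'], the first three entries of inv are [2, 0, 1], and back comes out as [1, 2, 0], so 'a' maps to 1, 'b' to 2 and 'c' to 0, matching their positions in the unsorted dictionary.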

How to apply a non-integer -> integer dictionary to a numpy array?
