Question

I have two NumPy arrays:

A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = np.asarray(['2', '4', '8', '16', '32'])

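I would like a function that takes A and B as arguments and returns, for each value in A, its index in B, aligned with A, as efficiently as possible.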

The output for the test case above is:

indices = [1, 1, 0, 2, 2, 2, 2, 2, 3, 4, 3, 3, 4]

I have tried exploring in1d() and where(), but with no luck. Any help is much appreciated.

Edit: The arrays contain strings.

Answer 1

You can also do it this way:

>>> np.digitize(A,B)-1
array([1, 1, 0, 2, 2, 2, 2, 2, 3, 4, 3, 3, 4])

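According to the documentation, you should be able to pass right=True and skip the subtraction. That did not work for me, probably because of a version issue: I do not have NumPy 1.7.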
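For readers on a NumPy version that has the `right` keyword, here is a minimal sketch comparing the two spellings. It assumes the strings hold integers and converts them first, because np.digitize expects monotonic bins and B is not sorted in string order ('16' sorts before '2'). The names `A_int`, `B_int`, `idx_default`, and `idx_right` are illustrative only.

```python
import numpy as np

A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = np.asarray(['2', '4', '8', '16', '32'])

# Assumption: every string is an integer literal, so the conversion is safe.
A_int = A.astype(int)
B_int = B.astype(int)

# Default (right=False): bins[i-1] <= x < bins[i], so exact matches land one bin too high.
idx_default = np.digitize(A_int, B_int) - 1

# right=True: bins[i-1] < x <= bins[i], so exact matches map straight to their index.
idx_right = np.digitize(A_int, B_int, right=True)

print(idx_default)
print(idx_right)
```

Both variants rely on B being sorted and on every value of A actually occurring in B; a value that falls between two bins would silently get a neighbouring index.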

I'm not sure what you are trying to do, but a simple and fast way is:

>>> A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
>>> B,indices=np.unique(A,return_inverse=True)
>>> B
array(['16', '2', '32', '4', '8'],
      dtype='|S2')
>>> indices
array([3, 3, 1, 4, 4, 4, 4, 4, 0, 2, 0, 0, 2])

>>> B[indices]
array(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'],
      dtype='|S2')

B comes back in sorted order rather than in your original order, but that can be changed if needed.
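If B has to keep its original order, one possible way to change the result is to look up only the unique values in B and then expand through the inverse. This is a sketch, not a definitive implementation: it assumes every value of A occurs in B, and the names `uniq`, `order`, and `pos_in_b` are made up for illustration.

```python
import numpy as np

A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = np.asarray(['2', '4', '8', '16', '32'])

# Unique values of A (sorted) and, for each element of A, its position in uniq.
uniq, inverse = np.unique(A, return_inverse=True)

# B need not be sorted: argsort gives the order in which searchsorted should read it.
order = np.argsort(B)

# Position of each unique value in the original (unsorted) B.
pos_in_b = order[np.searchsorted(B, uniq, sorter=order)]

# Expand back to the full length of A.
indices = pos_in_b[inverse]
print(indices)
```

Because only the unique values are searched, this can pay off when A is long and contains many repeats.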

Answer 2

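For this kind of task, it is important that lookups in B are as fast as possible. A dictionary provides O(1) lookup time, so first let's build the dictionary: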

>>> indices = dict((value,index) for index,value in enumerate(B))
>>> indices
{8: 2, 16: 3, 2: 0, 4: 1, 32: 4}

Then simply go through A and look up the corresponding index:

>>> [indices[item] for item in A]
[1, 1, 0, 2, 2, 2, 2, 2, 3, 4, 3, 3, 4]
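The same idea can be wrapped into a function that works with the string arrays from the question. This is only a sketch: the function name `indices_in_b` and the `missing` sentinel are hypothetical additions, not part of the answer above.

```python
import numpy as np

def indices_in_b(a, b, missing=-1):
    """Return, for each item of a, its index in b (or `missing` if absent)."""
    # tolist() turns NumPy scalars into plain Python objects before hashing.
    # If b contains duplicates, the last occurrence wins.
    lookup = {value: index for index, value in enumerate(b.tolist())}
    return np.array([lookup.get(item, missing) for item in a.tolist()],
                    dtype=np.intp)

A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = np.asarray(['2', '4', '8', '16', '32'])
print(indices_in_b(A, B))
```

Using `lookup.get` instead of `lookup[item]` trades the KeyError for a sentinel value, which may or may not be what you want.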

Answer 3

I think you can do this with np.searchsorted:

>>> A = np.asarray([4, 4, 2, 8, 8, 8, 8, 8, 16, 32, 16, 16, 32])
>>> B = np.asarray([2, 8, 4, 32, 16])
>>> sort_b = np.argsort(B)
>>> idx_of_a_in_sorted_b = np.searchsorted(B, A, sorter=sort_b)
>>> idx_of_a_in_b = np.take(sort_b, idx_of_a_in_sorted_b)
>>> idx_of_a_in_b
array([2, 2, 0, 1, 1, 1, 1, 1, 4, 3, 4, 4, 3], dtype=int64)

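Note that B is scrambled relative to your version, hence the different output. If some items of A are not in B (you can check with np.all(np.in1d(A, B))), the indices returned for those values will be garbage, and you may even get an IndexError from the last line (for example, when A contains a value larger than every element of B).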
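One way to guard against both failure modes is to clip the search positions and then verify each hit. The sketch below assumes B is non-empty; the function name `find_indices` and the `missing` sentinel are illustrative choices, not part of the answer above.

```python
import numpy as np

def find_indices(a, b, missing=-1):
    """Index in b of each element of a, or `missing` where a's value is absent."""
    sorter = np.argsort(b)
    pos = np.searchsorted(b, a, sorter=sorter)
    # Values larger than everything in b get position len(b); clip to stay in range.
    pos = np.clip(pos, 0, len(b) - 1)
    candidates = sorter[pos]
    # A candidate is a real match only if b actually holds the same value there.
    found = b[candidates] == a
    return np.where(found, candidates, missing)

A = np.asarray([4, 4, 2, 8, 8, 8, 8, 8, 16, 32, 16, 16, 32])
B = np.asarray([2, 8, 4, 32, 16])
print(find_indices(A, B))
print(find_indices(np.asarray([3, 64, 8]), B))
```

The extra comparison costs one more pass over A, but it replaces garbage indices and the possible IndexError with an explicit sentinel.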

Answer 4

The numpy_indexed package (disclaimer: I am its author) implements a solution equivalent to Jaime's, but with a nice interface, tests, and many related useful functions:

import numpy_indexed as npi
print(npi.indices(B, A))

Answer 5

I'm not sure how efficient this is, but it works:

import numpy as np
A = np.asarray(['4', '4', '2', '8', '8', '8', '8', '8', '16', '32', '16', '16', '32'])
B = np.asarray(['2', '4', '8', '16', '32'])
idx_of_a_in_b=np.argmax(A[np.newaxis,:]==B[:,np.newaxis],axis=0)
print(idx_of_a_in_b)

This gives me:

[1 1 0 2 2 2 2 2 3 4 3 3 4]
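Two caveats are worth illustrating. The comparison builds a boolean matrix of shape (len(B), len(A)), so memory grows with the product of the two lengths, and np.argmax returns 0 for a column with no match. A small sketch of how absent values could be flagged; the names `matches` and `present` and the sentinel -1 are assumptions made for this example:

```python
import numpy as np

A = np.asarray(['4', '2', '64'])          # '64' does not occur in B
B = np.asarray(['2', '4', '8', '16', '32'])

# Row i, column j is True where B[i] == A[j].
matches = A[np.newaxis, :] == B[:, np.newaxis]

# First matching row per column; 0 is also returned when nothing matches.
idx = np.argmax(matches, axis=0)

# Distinguish a genuine index 0 from "no match at all".
present = matches.any(axis=0)
idx = np.where(present, idx, -1)
print(idx)
```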

Get indices of NumPy array in array B for unique values in array A, for values present in both arrays, aligned with array A
