Question

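I'm trying to build an array containing all the rows of a (very large) array that match each value in a set of unique values. The catch is that the large array has multiple matching rows per value, and I need them all stored together in the same row of the new array.

Looping over each unique value with a for loop works, but it is too slow to be usable. I've been looking for a vectorized solution without success. Any help would be much appreciated!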

    arrStart = []
    startRavel = startInforce['pol_id'].ravel()
    for policy in unique_policies:
        arrStart.append(np.argwhere(startRavel == policy))

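The new array has the same length as the array of unique values, but each element is a list of all the rows in the large array that match that unique value.

Sample input would look like this: startRavel = [1,2,2,2,3,3], unique_policies = [1,2,3]

Output: arrStart = [[0], [1,2,3], [4,5]]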

Answer 1

One possible NumPy option, similar to yours but written as a list comprehension with each result flattened:

    import numpy as np

    startRavel = np.array([1,2,2,2,3,3])
    unique_policies = np.array([1,2,3])

    # flatten() turns each (n, 1) argwhere result into a 1-D index array
    [np.argwhere(startRavel == policy).flatten() for policy in unique_policies]
    #=> [array([0]), array([1, 2, 3]), array([4, 5])]

Or, using flatnonzero():

    [np.flatnonzero(startRavel == policy) for policy in unique_policies]
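
To make the difference between the two calls concrete, here is a small sketch on the sample data from the question. The variable names `start_ravel`, `mask`, `col` and `flat` are only illustrative; the point is that `np.argwhere` returns one row per match (hence the extra `flatten()` above), while `np.flatnonzero` already returns a 1-D array.

```python
import numpy as np

start_ravel = np.array([1, 2, 2, 2, 3, 3])
mask = start_ravel == 2

col = np.argwhere(mask)      # 2-D: one row per match, one column per array dimension
flat = np.flatnonzero(mask)  # 1-D: just the matching positions

print(col.shape)   # (3, 1)
print(flat.shape)  # (3,)
```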

Generator version:

    def matches_indexes(startRavel, unique_policies):
        # Lazily yield the index array for one policy at a time
        for policy in unique_policies:
            yield np.flatnonzero(startRavel == policy)
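
The options above still compare the whole array once per unique value, so they may remain slow for very large inputs. As a possible alternative (not part of the answer above), the sketch below sorts the array once and then looks up each value's block of positions with `np.searchsorted`. The helper name `group_indices` is hypothetical, and the sketch assumes the values are sortable (for example, integer policy IDs).

```python
import numpy as np

def group_indices(values, keys):
    """Return, for each key, the positions in `values` equal to that key."""
    values = np.asarray(values)
    keys = np.asarray(keys)
    # A stable sort keeps the original positions in ascending order within each group
    order = np.argsort(values, kind="stable")
    sorted_values = values[order]
    # Boundaries of each key's run inside the sorted array
    starts = np.searchsorted(sorted_values, keys, side="left")
    ends = np.searchsorted(sorted_values, keys, side="right")
    # Slicing is cheap; a key with no match yields an empty array
    return [order[s:e] for s, e in zip(starts, ends)]

result = group_indices([1, 2, 2, 2, 3, 3], [1, 2, 3])
print(result)
```

On the sample input this should give the same grouping as the list comprehension. The remaining Python loop only slices the sorted index array, so the cost is dominated by a single sort rather than one full comparison per unique value.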
