Question

a = np.array([5,8,3,4,2,5,7,8,1,9,1,3,4,7])
b = np.array([3,4,7,8,1,3])

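I have two lists of integers, each grouped by every two consecutive items (i.e., indices [0,1], [2,3], and so on). No pair of items appears more than once in either list, whether in the same or in reverse order.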

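One list is significantly larger and contains the other. I am trying to find an efficient way to get the indices of the larger list's grouped items that also appear in the smaller list.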

The desired output for the example above is:

[2,3,6,7,10,11] #indices

Note that, as an example, the first group ([3,4]) should not match indices 11,12, because there 3 is the second element of [1,3] and 4 is the first element of [4,7].
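To make the pairing constraint concrete, here is a small sketch (the variable names `pairs` and `straddling` are illustrative only, not part of the question). It shows that reshaping into two columns fixes the pair boundaries, so the values at indices 11 and 12 belong to two different pairs.

```python
import numpy as np

a = np.array([5, 8, 3, 4, 2, 5, 7, 8, 1, 9, 1, 3, 4, 7])

# One pair per row: row k holds a[2*k] and a[2*k + 1].
pairs = a.reshape(-1, 2)

# Indices 11 and 12 hold the values 3 and 4, but they straddle
# two rows: pairs[5] is [1, 3] and pairs[6] is [4, 7].
straddling = a[11:13]

print(pairs[5], pairs[6], straddling)
```

Any approach that compares whole rows of `pairs` therefore cannot report 11,12 as a match for [3,4].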

Answer 1

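Since you are grouping your arrays by pairs, you can reshape them into 2 columns for comparison. You can then compare each row of the shorter array against the longer array and reduce the resulting boolean arrays. From there, getting the indices from a reshaped np.arange is straightforward.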

import numpy as np
from functools import reduce

a = np.array([5,8,3,4,2,5,7,8,1,9,1,3,4,7])
b = np.array([3,4,7,8,1,3])

# reshape a and b into columns
a2 = a.reshape((-1,2))
b2 = b.reshape((-1,2))

# create a generator of bools for the row of a2 that holds b2
b_in_a_generator = (np.all(a2==row, axis=1) for row in b2)

# reduce the generator to get an array of booleans that is True for each row
# of a2 that equals one of the rows of b2 (adding boolean arrays acts as a logical OR)
ix_bool = reduce(lambda x,y: x+y, b_in_a_generator)

# grab the indices by slicing a reshaped np.arange array
ix = np.arange(len(a)).reshape((-1,2))[ix_bool]

ix
# returns:
array([[ 2,  3],
       [ 6,  7],
       [10, 11]])

If you want a flat array, simply ravel ix

ix.ravel()
# returns
array([ 2,  3,  6,  7, 10, 11])
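As a possible variation, the same row-by-row comparison can be written with broadcasting instead of a generator and reduce. This is a minimal sketch rather than part of the answer above; the names `match` and `ix_flat` are illustrative, and it builds an intermediate boolean array of shape (rows of a2, rows of b2, 2), so it uses more memory on large inputs.

```python
import numpy as np

a = np.array([5, 8, 3, 4, 2, 5, 7, 8, 1, 9, 1, 3, 4, 7])
b = np.array([3, 4, 7, 8, 1, 3])

a2 = a.reshape(-1, 2)
b2 = b.reshape(-1, 2)

# Shapes (n_a, 1, 2) and (1, n_b, 2) broadcast to (n_a, n_b, 2);
# all(axis=2) is True where a full pair of a2 equals a full pair of b2.
match = (a2[:, None, :] == b2[None, :, :]).all(axis=2)

# A row of a2 is selected if it equals any row of b2.
ix_bool = match.any(axis=1)

# Map the selected rows back to flat indices into a.
ix_flat = np.arange(len(a)).reshape(-1, 2)[ix_bool].ravel()
print(ix_flat)
```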

Answer 2

Here is one approach that uses NumPy views to treat each group of elements as a single item -

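import numpy as np

# Taken from https://stackoverflow.com/a/45313353/
def view1D(a, b): # a, b are arrays
    # both inputs must be contiguous before they can be viewed as void rows
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(),  b.view(void_dt).ravel()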

def grouped_indices(a, b):
    a0v, b0v = view1D(a.reshape(-1,2), b.reshape(-1,2))
    sidx = a0v.argsort()
    idx = sidx[np.searchsorted(a0v,b0v, sorter=sidx)]
    return ((idx*2)[:,None] + [0,1]).ravel()

If some group in b has no match in a, we can filter it out using the mask a0v[idx] == b0v.

Sample run -

In [345]: a
Out[345]: array([5, 8, 3, 4, 2, 5, 7, 8, 1, 9, 1, 3, 4, 7])

In [346]: b
Out[346]: array([3, 4, 7, 8, 1, 3])

In [347]: grouped_indices(a, b)
Out[347]: array([ 2,  3,  6,  7, 10, 11])
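For the case where a pair in b is missing from a, the masking step mentioned earlier could look roughly like the sketch below. To keep it self-contained it swaps the void view for plain integer keys (each pair is packed into one int64), which is a different encoding from the one used in `grouped_indices`. The helper names `pair_keys` and `grouped_indices_masked` are hypothetical, and the sketch assumes non-empty integer inputs whose value range is small enough that the packed keys do not overflow int64. It also clips the positions returned by np.searchsorted, because a key larger than every key in a would otherwise index one past the end.

```python
import numpy as np

def pair_keys(pairs, lo, span):
    # Pack each (x, y) pair into a single integer that is unique
    # for values in [lo, lo + span).
    return (pairs[:, 0] - lo) * span + (pairs[:, 1] - lo)

def grouped_indices_masked(a, b):
    a2 = np.asarray(a, dtype=np.int64).reshape(-1, 2)
    b2 = np.asarray(b, dtype=np.int64).reshape(-1, 2)
    lo = min(a2.min(), b2.min())
    span = max(a2.max(), b2.max()) - lo + 1
    ka = pair_keys(a2, lo, span)
    kb = pair_keys(b2, lo, span)

    sidx = ka.argsort()
    pos = np.searchsorted(ka, kb, sorter=sidx)
    pos = np.clip(pos, 0, len(ka) - 1)  # keep positions in range
    idx = sidx[pos]

    # Mask: keep only the pairs of b that really occur in a.
    idx = idx[ka[idx] == kb]
    return ((idx * 2)[:, None] + [0, 1]).ravel()

a = np.array([5, 8, 3, 4, 2, 5, 7, 8, 1, 9, 1, 3, 4, 7])
print(grouped_indices_masked(a, np.array([3, 4, 7, 8, 1, 3])))
print(grouped_indices_masked(a, np.array([3, 4, 9, 9, 1, 3])))  # [9, 9] is absent
```

As in the searchsorted version above, the result follows the order of the pairs in b.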

Another one, using np.in1d in place of np.searchsorted -

def grouped_indices_v2(a, b):
    a0v, b0v = view1D(a.reshape(-1,2), b.reshape(-1,2))
    return (np.flatnonzero(np.in1d(a0v, b0v))[:,None]*2 + [0,1]).ravel()
