Question

I have two numpy arrays, e.g. X=[x1,x2,x3,x4] and y=[y1,y2,y3,y4]. Three of the elements are close to each other, and the fourth may or may not be close.

For example:

X   [ 84.04467948  52.42447842  39.13555678  21.99846595]
y   [ 78.86529444  52.42447842  38.74910101  21.99846595]

Or it could be:

X   [ 84.04467948  60  52.42447842  39.13555678]
y   [ 78.86529444  52.42447842  38.74910101  21.99846595]

I want to define a function that finds the corresponding indices in the two arrays. In the first case:

y[0] corresponds to X[0],
y[1] corresponds to X[1],
y[2] corresponds to X[2],
y[3] corresponds to X[3]

In the second case:

y[0] corresponds to X[0],
y[1] corresponds to X[2],
y[2] corresponds to X[3]
and y[3] corresponds to X[1].

I have not been able to write a function that fully solves the problem. Please help.

Answer 1

Using these answers: https://stackoverflow.com/a/8929827/3627387 and https://stackoverflow.com/a/12141207/3627387

Fixed:

def find_closest(alist, target):
    return min(alist, key=lambda x:abs(x-target))

X = [ 84.04467948,  52.42447842,  39.13555678,  21.99846595]
Y = [ 78.86529444,  52.42447842,  38.74910101,  21.99846595]

def list_matching(list1, list2):
    list1_copy = list1[:]
    pairs = []
    for i, e in enumerate(list2):
        elem = find_closest(list1_copy, e)
        pairs.append([i, list1.index(elem)])
        list1_copy.remove(elem)
    return pairs
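The function above looks up positions with list1.index(elem), which returns the first occurrence of a value, so repeated values in X would all map to the same index. Below is a minimal sketch of the same greedy idea that tracks indices instead of values. The name match_by_index is hypothetical, and the sketch assumes plain Python lists (or anything indexable) as input.

```python
def match_by_index(xs, ys):
    """Greedily pair each ys[j] with the closest still-unused xs[i].

    Returns a list of (j, i) index pairs, in the order of ys.
    """
    remaining = list(range(len(xs)))  # indices of xs not yet matched
    pairs = []
    for j, y in enumerate(ys):
        if not remaining:
            break  # more ys than xs: leave the rest unmatched
        # Closest unused x, compared by index so duplicates stay distinct
        i = min(remaining, key=lambda k: abs(xs[k] - y))
        pairs.append((j, i))
        remaining.remove(i)
    return pairs


X2 = [84.04467948, 60, 52.42447842, 39.13555678]
Y2 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
print(match_by_index(X2, Y2))  # [(0, 0), (1, 2), (2, 3), (3, 1)]
```

Like the original, this is order-dependent: each y takes the best x still available when its turn comes, so a different ordering of Y can give a different pairing.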

Answer 2

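The best approach seems to be to pre-sort both arrays (n log(n)) and then do a merge-like traversal over the two arrays. That is certainly faster than the n^2 you mentioned in the comments.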
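As a rough illustration of the pre-sorting idea, the sketch below sorts X once and then finds the nearest X value for each y. It uses a binary search per lookup rather than a literal merge pass, which keeps the same n log(n) bound. The function name nearest_in_sorted is hypothetical, X is assumed to be non-empty, and the result is not forced to be one-to-one, so two y values can share an x.

```python
from bisect import bisect_left


def nearest_in_sorted(xs, ys):
    """For each ys[j], return the index into xs of the nearest value.

    xs is sorted once; each lookup is then a binary search.
    The result is NOT guaranteed to be one-to-one.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])  # argsort of xs
    sorted_xs = [xs[i] for i in order]
    result = []
    for y in ys:
        pos = bisect_left(sorted_xs, y)
        # Only the neighbours on either side of the insertion point can be nearest
        candidates = [p for p in (pos - 1, pos) if 0 <= p < len(sorted_xs)]
        best = min(candidates, key=lambda p: abs(sorted_xs[p] - y))
        result.append(order[best])  # map back to the original index
    return result


X2 = [84.04467948, 60, 52.42447842, 39.13555678]
Y2 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
print(nearest_in_sorted(X2, Y2))  # [0, 2, 3, 3]
```

In the second case from the question, both y[2] and y[3] land on X[3], so a further step is still needed to resolve such collisions and assign the leftover element.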

Answer 3

You can start by precomputing the distance matrix, as shown in this answer:

import numpy as np

X = np.array([84.04467948,60.,52.42447842,39.13555678])
Y = np.array([78.86529444,52.42447842,38.74910101,21.99846595])

dist = np.abs(X[:, np.newaxis] - Y)

Now compute the argmin along one axis (I chose axis 1, which finds the closest element of Y for each element of X):

potentialClosest = dist.argmin(axis=1)

This result may still contain duplicates (as it does in your second case). To check, use np.unique to find all indices of Y that appear in potentialClosest:

closestFound, closestCounts = np.unique(potentialClosest, return_counts=True)

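You can now check for duplicates by testing whether closestFound.shape[0] == X.shape[0]. If it does, you are done: potentialClosest contains the partner of every element in X. In your second case, one element occurs twice, so closestFound has only X.shape[0]-1 elements, and closestCounts contains not only 1s but also a single 2. Every element with a count of 1 has found its partner. Of the two candidates that share the element with count 2, the one at the smaller distance keeps it, while the partner of the one at the larger distance is the single element of Y that does not appear in closestFound. That element can be found with: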

# np.isin replaces np.in1d, which is deprecated in recent NumPy releases
missingPartnerIndex = np.where(
        ~np.isin(np.arange(Y.shape[0]), closestFound)
        )[0][0]

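You can do the matching in a loop (even though there may be a better way with numpy). This solution is rather ugly, but it works. Suggestions for improvement are very welcome: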

partners = np.empty_like(X, dtype=int)
nonClosePartnerFound = False
for i in np.arange(X.shape[0]):
    if closestCounts[closestFound==potentialClosest[i]][0]==1:
        # A unique partner was found
        partners[i] = potentialClosest[i]
    else:
        # Partner is not unique
        if nonClosePartnerFound:
            partners[i] = potentialClosest[i]
        else:
            if np.argmin(dist[:, potentialClosest[i]]) == i:
                partners[i] = potentialClosest[i]
            else:
                partners[i] = missingPartnerIndex
                nonClosePartnerFound = True
print(partners)

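This answer only works if exactly one pair is not close. Otherwise, you have to define how to find the right partner for multiple non-close elements. Sadly, it is not a very general solution, but hopefully you will find it a useful starting point.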
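One possible way to define the matching when several elements are not close is to take the closest unused (x, y) pair over and over until everything is paired. The sketch below is an assumption about what "the right partner" could mean, not the method of this answer. It is written in plain Python so that it is self-contained, and the name greedy_global_match is hypothetical.

```python
def greedy_global_match(xs, ys):
    """Pair elements by repeatedly taking the closest unused (x, y) pair.

    Returns a list `partners` where partners[i] is the index into ys
    matched with xs[i], or None if xs[i] was left unmatched.
    """
    # Every candidate pair, sorted from closest to farthest
    candidates = sorted(
        (abs(x - y), i, j)
        for i, x in enumerate(xs)
        for j, y in enumerate(ys)
    )
    partners = [None] * len(xs)
    used_ys = set()
    for _, i, j in candidates:
        # Accept a pair only if neither side has been matched yet
        if partners[i] is None and j not in used_ys:
            partners[i] = j
            used_ys.add(j)
    return partners


X2 = [84.04467948, 60, 52.42447842, 39.13555678]
Y2 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
print(greedy_global_match(X2, Y2))  # [0, 3, 1, 2]
```

For the second case this gives the same pairing that the question asks for: the three close pairs are matched first and the leftover X[1] is paired with Y[3]. Sorting all pairs costs on the order of n^2 log(n), which is fine for small arrays but not for large ones.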

Answer 4

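The following simply prints the corresponding indices of the two arrays, as you did in your question, since I'm not sure what output you want your function to give.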

X1 = [84.04467948, 52.42447842, 39.13555678, 21.99846595]
Y1 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]

X2 = [84.04467948, 60, 52.42447842, 39.13555678]
Y2 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]

def find_closest(x_array, y_array):
    # Copy x_array as we will later remove an item with each iteration and
    # require the original later
    remaining_x_array = x_array[:]
    for y in y_array:
        differences = []
        for x in remaining_x_array:
            differences.append(abs(y - x))
        # min_index_remaining is the index position of the closest x value
        # to the given y in remaining_x_array
        min_index_remaining = differences.index(min(differences))
        # related_x is the closest x value of the given y
        related_x = remaining_x_array[min_index_remaining]
        print('Y[%s] corresponds to X[%s]' % (y_array.index(y), x_array.index(related_x)))
        # Remove the corresponding x value in remaining_x_array so it
        # cannot be selected twice
        remaining_x_array.pop(min_index_remaining)

This then outputs the following:

find_closest(X1,Y1)
Y[0] corresponds to X[0]
Y[1] corresponds to X[1]
Y[2] corresponds to X[2]
Y[3] corresponds to X[3]

and

find_closest(X2,Y2)
Y[0] corresponds to X[0]
Y[1] corresponds to X[2]
Y[2] corresponds to X[3]
Y[3] corresponds to X[1]

Hope this helps.

How do I find the closest elements in two arrays?
