Question

How does this function work?

import numpy as np
first_names = (5,5,5)
last_names = (3,1,2)
x = np.lexsort((first_names, last_names))
print(x)

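It gives the output [1 2 0]. I assume the two lists are sorted by the variable last_names. If so, how does the number 2 end up at index 0? 2 lies between 1 and 3, so I don't understand how this sort works. Please explain.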

Answer 1

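Essentially, np.lexsort((first_names, last_names)) says: sort by last_names first, then by first_names.

Reading the documentation, particularly the example under "Sort two columns of numbers:", is very revealing. You first sort by last_names, reordering so that index 1 (whose value is 1) comes first, index 2 (whose value is 2) comes second, and index 0 (whose value is 3) comes third. In this order, last_names ends up as (1,2,3), i.e. sorted. The returned array holds those indices, not the values, which is why 2 appears in the middle of [1 2 0]. Then, if there are any ties, the corresponding entries in first_names act as the tie-breaker.

For example, consider this case: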

first_names = (5,5,4)
last_names = (3,1,1)

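There is a tie between indices 1 and 2 in last_names (both have the value 1), and it is broken by the corresponding entries in first_names. At indices 1 and 2 of first_names, index 2 (value 4) is lower than index 1 (value 5), so it comes first. The resulting lexsort is therefore [2,1,0]: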

np.lexsort((first_names, last_names))
# array([2, 1, 0])
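
One way to check this reading of the key order is to compare np.lexsort against Python's built-in sorted() applied to (last_name, first_name) tuples. This is only a sketch: the helper name lexsort_reference is made up for illustration and is not part of NumPy.

```python
import numpy as np

def lexsort_reference(keys):
    """Hypothetical helper: pure-Python stand-in for np.lexsort on 1-D keys.

    As with np.lexsort, the LAST sequence in `keys` is the primary sort key.
    """
    n = len(keys[0])
    # Build one tuple per index, primary key first, and sort the indices by it.
    return sorted(range(n), key=lambda i: tuple(k[i] for k in reversed(keys)))

first_names = (5, 5, 4)
last_names = (3, 1, 1)

expected = lexsort_reference((first_names, last_names))
actual = np.lexsort((first_names, last_names)).tolist()
print(expected, actual)  # [2, 1, 0] [2, 1, 0]
```

Both approaches rank index 2 before index 1 because the tuples compared are (1, 4) and (1, 5).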

Answer 2

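It returns [1, 2, 0] because index 1 corresponds to the '1' in last_names, index 2 corresponds to the '2', and index 0 corresponds to the '3'. Think of the returned values as the order of indices you would need to use to sort your array:

last_names[1], last_names[2], last_names[0] 
# 1, 2, 3

Taking the elements in that order sorts the array.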
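
As a small illustration of using the result as an index order, the sketch below applies it to both sequences from the question. The tuples are converted with np.asarray first, since plain tuples cannot be indexed with an index array; the variable names order, sorted_last and sorted_first are only illustrative.

```python
import numpy as np

first_names = (5, 5, 5)
last_names = (3, 1, 2)

order = np.lexsort((first_names, last_names))  # array([1, 2, 0])

# Index arrays work on NumPy arrays, not on plain tuples.
sorted_last = np.asarray(last_names)[order]
sorted_first = np.asarray(first_names)[order]

print(sorted_last)   # [1 2 3]
print(sorted_first)  # [5 5 5]
```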

Answer 3

In layman's terms:
First, let's sort the first names and the last names separately.

first_names = np.array(['Betsey', 'Shelley', 'Lanell', 'Genesis', 'Margery'])
last_names = np.array(['Battle', 'Brien', 'Plotner', 'Stahl', 'Woolum'])
first_names.sort()
first_names

>>array(['Betsey', 'Genesis', 'Lanell', 'Margery', 'Shelley'], dtype='<U7')

last_names.sort()
last_names

>>array(['Battle', 'Brien', 'Plotner', 'Stahl', 'Woolum'], dtype='<U7')

[first_names[i] + ' ' + last_names[i] for i in range(len(first_names))]

>>['Betsey Battle', 'Genesis Brien', 'Lanell Plotner', 'Margery Stahl', 'Shelley Woolum']

Now let's say we want to sort the names by first name only.

first_names = np.array(['Betsey', 'Shelley', 'Lanell', 'Genesis', 'Margery'])
last_names = np.array(['Battle', 'Brien', 'Plotner', 'Stahl', 'Woolum'])
_ = np.lexsort((last_names, first_names))
[first_names[i] + ' ' + last_names[i] for i in _]

>>['Betsey Battle', 'Genesis Stahl', 'Lanell Plotner', 'Margery Woolum', 'Shelley Brien']

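An obvious question arises here:
If we can sort with the array.sort() method, what is the point of np.lexsort()?
If you look closely at the first two outputs, you can find the answer: sorting each array separately breaks the pairing between first and last names (it produces 'Genesis Brien', although the original pair is 'Genesis Stahl'), whereas lexsort returns one index order that keeps each pair together.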
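
The difference between the two outputs above can be checked directly. The sketch below compares the set of (first, last) pairs before and after each approach; it uses np.sort (which returns sorted copies) instead of the in-place sort() so that the original arrays stay intact, and the variable names are only illustrative.

```python
import numpy as np

first_names = np.array(['Betsey', 'Shelley', 'Lanell', 'Genesis', 'Margery'])
last_names = np.array(['Battle', 'Brien', 'Plotner', 'Stahl', 'Woolum'])

original_pairs = set(zip(first_names.tolist(), last_names.tolist()))

# Approach 1: sort each array on its own.
independent = set(zip(np.sort(first_names).tolist(),
                      np.sort(last_names).tolist()))

# Approach 2: reorder both arrays through one shared index array.
order = np.lexsort((last_names, first_names))
shared = set(zip(first_names[order].tolist(), last_names[order].tolist()))

print(independent == original_pairs)  # False: the pairs were scrambled
print(shared == original_pairs)       # True: every pair survived
```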

To keep things simple, let's now look at another scenario with two identical first names.

first_names = np.array(['Betsey', 'Shelley', 'Lanell', 'Betsey', 'Margery'])
last_names = np.array(['Battle', 'Brien', 'Plotner', 'Stahl', 'Woolum'])

A plain sort() cannot order the first names together with their corresponding last names, but lexsort() can, following the order of its input arguments.

_ = np.lexsort((last_names, first_names))
[first_names[i] + ' ' + last_names[i] for i in _]

>>['Betsey Battle', 'Betsey Stahl', 'Lanell Plotner', 'Margery Woolum', 'Shelley Brien']

We can do the same with more arrays, for example np.lexsort((last_names, middle_names, first_names)).
The arrays are sorted by first_names first; where those values are equal, by middle_names; and so on...
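
A minimal sketch of the three-key case follows. The data, including every middle name, is invented purely for illustration; it is chosen so that three people share a first name and two of those also share a middle name.

```python
import numpy as np

# Made-up sample data (not from the examples above).
first_names = np.array(['Betsey', 'Shelley', 'Betsey', 'Betsey'])
middle_names = np.array(['Ann', 'Lee', 'Ann', 'Kay'])
last_names = np.array(['Stahl', 'Brien', 'Battle', 'Woolum'])

# Last key is the primary one: first_names, then middle_names, then last_names.
order = np.lexsort((last_names, middle_names, first_names))
full = [f"{first_names[i]} {middle_names[i]} {last_names[i]}" for i in order]

print(order)  # [2 0 3 1]
print(full)
# ['Betsey Ann Battle', 'Betsey Ann Stahl', 'Betsey Kay Woolum', 'Shelley Lee Brien']
```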

Answer 4

It organizes the two lists into pairs by index -> [0, 0, 1, 1, 2, 2 ...] and sorts those pairs in ascending order. In this case, note the output:

import numpy as np
# idx:        0   1  2  3  4   5  6
a = np.array([9, 74, 1, 3, 4, 89, 6])
b = np.array([4,  6, 9, 2, 1,  8, 7])
print(np.lexsort((b, a)))  # a is the primary key because it comes last

Output: [2 3 4 6 0 1 5]

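The first number is 2, which is the index of the smallest value in a (the value 1), and that value is paired with the 9 in b because they share the same index. Likewise, the second smallest value in a is 3 (index 3), which is paired with the 2 in b because they have the same index!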
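
The pairing described above can be made visible by indexing both arrays with the result. This sketch assumes the output shown came from np.lexsort((b, a)), which is the call that reproduces it; the names order and pairs are only illustrative.

```python
import numpy as np

a = np.array([9, 74, 1, 3, 4, 89, 6])
b = np.array([4, 6, 9, 2, 1, 8, 7])

# Assumed call: a is listed last, so it is the primary sort key.
order = np.lexsort((b, a))

# Each element of a travels together with the element of b at the same index.
pairs = list(zip(a[order].tolist(), b[order].tolist()))

print(order)  # [2 3 4 6 0 1 5]
print(pairs)  # [(1, 9), (3, 2), (4, 1), (6, 7), (9, 4), (74, 6), (89, 8)]
```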

How does the function np.lexsort work?
