Question

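I have two arrays whose indices are related: x[0] corresponds to y[0], so the two need to stay in matching order. I have split the x array into two bins, as shown in the code below.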

import numpy as np

x = [1,4,7,0,5]
y = [.1,.7,.6,.8,.3]

binx = [0,4,9]
index = np.digitize(x,binx)

This gives me the following:

In [1]: index
Out[1]: array([1, 2, 2, 1, 2])

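So far so good. (I think)

The y array is a parameter that tells me how good the measurement of each x data point is, so .9 is better than .2. I therefore use the following code to pick out the best values of the y array: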

y.sort() 
ysorted = y[int(len(y) * .5):]

which gives me:

In [2]: ysorted
Out[2]: [0.6, 0.7, 0.8]

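This gives me the top 50% of the array. Again, this is what I want.

My question is: how do I combine these two operations? From each bin, I need to take the best 50% and put those values into a new x array and a new y array, again keeping the indices of the two arrays matched. Or is there a simpler way? I hope this makes sense.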

Answer 1

You should make a list of pairs from your x and y lists.

This can be done with the zip function:

x = [1,4,7,0,5]
y = [.1,.7,.6,.8,.3]
values = list(zip(x, y))  # list() is needed on Python 3, where zip returns an iterator
values
[(1, 0.1), (4, 0.7), (7, 0.6), (0, 0.8), (5, 0.3)]

To sort such a list of pairs by a specific element of each pair, you can use the key argument of sort:

values.sort(key=lambda pair: pair[1])
values
[(1, 0.1), (5, 0.3), (7, 0.6), (4, 0.7), (0, 0.8)]

Then you can do whatever you want with this sorted list of pairs.
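As one possible next step, here is a minimal sketch that keeps the upper half of the sorted pairs and unzips them back into two parallel sequences. The names top_half, top_x and top_y are my own and do not come from the question; the slicing mirrors the int(len(y) * .5) expression the question uses.

```python
x = [1, 4, 7, 0, 5]
y = [.1, .7, .6, .8, .3]

# Pair each x with its y, then sort the pairs by the y value.
values = sorted(zip(x, y), key=lambda pair: pair[1])

# Keep the upper half (the best y values), sliced as in the question.
top_half = values[int(len(values) * .5):]

# Unzip back into two parallel tuples; each x stays with its y.
top_x, top_y = zip(*top_half)
print(top_x)  # (7, 4, 0)
print(top_y)  # (0.6, 0.7, 0.8)
```

This handles the whole data set at once; it does not yet treat each bin separately.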

Answer 2

Many numpy functions have arg... variants, which operate not "by value" but "by index". In your case, argsort does what you want:

order = np.argsort(y)
# order is an array of indices such that
# y[order] is sorted
top50 = order[len(order) // 2 :]
top50x = np.asarray(x)[top50]  # x must be a NumPy array to accept an index array
# now top50x are the x corresponding 1-to-1 to the 50% best y
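The snippet above ranks all points together. To answer the per-bin part of the question, one option is to apply the same argsort idea inside each bin found by np.digitize. This is only a sketch under my own assumptions: the names keep, members, new_x and new_y are hypothetical, and "best 50%" is taken to mean the slice from len // 2 onward, so a bin with an odd number of points keeps the larger half.

```python
import numpy as np

x = np.array([1, 4, 7, 0, 5])
y = np.array([.1, .7, .6, .8, .3])
binx = [0, 4, 9]

index = np.digitize(x, binx)  # bin number for each x

keep = []  # positions (into x and y) of the points to keep
for b in np.unique(index):
    members = np.flatnonzero(index == b)     # positions belonging to bin b
    order = members[np.argsort(y[members])]  # those positions, sorted by y
    keep.append(order[len(order) // 2:])     # upper half of this bin

keep = np.concatenate(keep)
new_x, new_y = x[keep], y[keep]  # same index array, so pairs stay aligned
print(new_x)  # [0 7 4]
print(new_y)  # [0.8 0.6 0.7]
```

Because both arrays are indexed with the same keep array, new_x[i] still corresponds to new_y[i]. Within each bin the kept points come out in ascending order of y.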

Binning, then sorting the arrays within each bin while keeping their indices together
