Ranking within the columns of a 2d array

Time: 2015-01-02 02:34:16

Tags: python numpy sorting scipy

>>> a = array([[10, 50, 20, 30, 40],
...            [50, 30, 40, 20, 10],
...            [30, 20, 20, 10, 50]])

>>> some_np_expression(a)
array([[1, 3, 1, 3, 2],
       [3, 2, 3, 2, 1],
       [2, 1, 2, 1, 3]])

What is some_np_expression? I don't care how ties are resolved, as long as the ranks are distinct and sequential.

3 answers:

Answer 0 (score: 6)

A double argsort is a standard (but inefficient!) way to do this:

In [120]: a
Out[120]: 
array([[10, 50, 20, 30, 40],
       [50, 30, 40, 20, 10],
       [30, 20, 20, 10, 50]])

In [121]: a.argsort(axis=0).argsort(axis=0) + 1
Out[121]: 
array([[1, 3, 1, 3, 2],
       [3, 2, 3, 2, 1],
       [2, 1, 2, 1, 3]])
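
If it helps to see why two argsorts give ranks, here is a small 1-d walk-through. The array x is made up for illustration, and kind="stable" is my addition so that ties would be broken by position in a predictable way.

```python
import numpy as np

# Hypothetical 1-d example, not from the question.
x = np.array([30, 10, 20])

# First argsort: order[k] is the index of the k-th smallest element of x.
order = x.argsort(kind="stable")

# Second argsort inverts that permutation: position[i] is the place
# that x[i] takes in the sorted order (zero-based).
position = order.argsort(kind="stable")

# Shift to one-based ranks.
ranks = position + 1
print(order, position, ranks)
```

The second sort only inverts a permutation, which is why it can be replaced by a single indexed assignment.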

With a bit more code, you can avoid sorting twice. Note that I use a different a below:

In [262]: a
Out[262]: 
array([[30, 30, 10, 10],
       [10, 20, 20, 30],
       [20, 10, 30, 20]])

Call argsort once:

In [263]: s = a.argsort(axis=0)

Use s to build the array of ranks:

In [264]: i = np.arange(a.shape[0]).reshape(-1, 1)

In [265]: j = np.arange(a.shape[1])

In [266]: ranked = np.empty_like(a, dtype=int)

In [267]: ranked[s, j] = i + 1

In [268]: ranked
Out[268]: 
array([[3, 3, 1, 1],
       [1, 2, 2, 3],
       [2, 1, 3, 2]])
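
The single-argsort steps above could be wrapped up roughly like this. This is only a sketch: the name rank_columns is hypothetical, and I swapped the fancy-index assignment for np.put_along_axis (NumPy 1.15 or later) and added kind="stable", neither of which is in the session above.

```python
import numpy as np

def rank_columns(a):
    """Return one-based ranks within each column of a 2d array (hypothetical helper)."""
    a = np.asarray(a)
    # One sort per column; a stable sort breaks ties by row order.
    s = a.argsort(axis=0, kind="stable")
    ranks = np.empty(a.shape, dtype=int)
    # Column vector 1..n_rows, broadcast across all columns.
    row_ranks = np.arange(1, a.shape[0] + 1).reshape(-1, 1)
    # Scatter: for each column j, ranks[s[k, j], j] = k + 1.
    np.put_along_axis(ranks, s, row_ranks, axis=0)
    return ranks

a = np.array([[30, 30, 10, 10],
              [10, 20, 20, 30],
              [20, 10, 30, 20]])
ranked = rank_columns(a)
print(ranked)
```

For this a, the result should agree with the ranked array shown above and with the double-argsort version.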

Here is the less efficient (but more concise) version:

In [269]: a.argsort(axis=0).argsort(axis=0) + 1
Out[269]: 
array([[3, 3, 1, 1],
       [1, 2, 2, 3],
       [2, 1, 3, 2]])

Answer 1 (score: 0)

SciPy now provides a function, rankdata, that ranks data along a given axis: pass the axis along which you want to rank via the axis argument.

from numpy import array
from scipy.stats.mstats import rankdata

a = array([[10, 50, 20, 30, 40],
           [50, 30, 40, 20, 10],
           [30, 20, 20, 10, 50]])

# axis=0 ranks within each column
ranked_vertical = rankdata(a, axis=0)
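
One thing to watch: the mstats version of rankdata averages the ranks of tied values, so the two 20s in the third column both get 1.5 rather than distinct ranks. If distinct ranks are needed, as the question asks, one option is scipy.stats.rankdata with method="ordinal". A minimal sketch, assuming a SciPy release recent enough that scipy.stats.rankdata accepts axis:

```python
import numpy as np
from scipy.stats import rankdata

a = np.array([[10, 50, 20, 30, 40],
              [50, 30, 40, 20, 10],
              [30, 20, 20, 10, 50]])

# method="ordinal" gives every value a distinct rank; ties are broken
# by the order in which the values appear along the axis.
ranked = rankdata(a, method="ordinal", axis=0)
print(ranked)
```

On this input the ranks should match the array requested in the question.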

Answer 2 (score: 0)

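The code is as follows. It gives zero-based integer ranks; note that rankdata averages tied values, so the cast to int truncates tied ranks instead of making them distinct.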

from scipy.stats.mstats import rankdata
import numpy as np

a = np.array([[10, 50, 20, 30, 40],
              [50, 30, 40, 20, 10],
              [30, 20, 20, 10, 50]])

rank = (rankdata(a, axis=0)-1).astype(int)