How to sort rows in pandas using a non-standard order

Date: 2016-08-09 17:46:51

Tags: python sorting pandas

I have a pandas DataFrame, say:

df = pd.DataFrame([['a', 3, 3], ['b', 2, 5], ['c', 4, 9], ['d', 1, 43]], columns=['col 1', 'col2', 'col 3'])

or:

  col 1  col2  col 3
0     a     3      3
1     b     2      5
2     c     4      9
3     d     1     43

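If I want to sort by col2, I can use df.sort_values (df.sort in older pandas versions), which sorts in ascending or descending order.

But what if I want to sort the rows so that col2 is [4, 2, 1, 3]? How would I do that?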

3 answers:

Answer 0 (score: 4)

One approach is to convert the column to a Categorical type, which can have an arbitrary order.

In [51]: df['col2'] = df['col2'].astype(pd.CategoricalDtype(categories=[4, 1, 2, 3], ordered=True))

In [52]: df.sort_values('col2')
Out[52]: 
  col 1 col2  col 3
2     c    4      9
3     d    1     43
1     b    2      5
0     a    3      3
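The same idea can be applied to the exact order asked for in the question (`[4, 2, 1, 3]`) without overwriting `col2` in `df`. This is a minimal sketch that assumes a pandas version providing `pandas.CategoricalDtype` (current releases do not accept the older `astype('category', categories=...)` keyword form); the names `order`, `custom_dtype` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 3, 3], ['b', 2, 5], ['c', 4, 9], ['d', 1, 43]],
    columns=['col 1', 'col2', 'col 3'],
)

# Desired order of the col2 values, taken from the question.
order = [4, 2, 1, 3]

# An ordered categorical dtype sorts by position in `categories`.
custom_dtype = pd.CategoricalDtype(categories=order, ordered=True)

# Sort a copy whose col2 is categorical; `df` itself is left unchanged.
result = df.assign(col2=df['col2'].astype(custom_dtype)).sort_values('col2')
print(result)
```

Note that any `col2` value missing from `order` would become NaN after the conversion, so the list should cover every value that occurs in the column.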

Answer 1 (score: 4)

Try this:

sortMap = {4: 1, 2: 2, 1: 3, 3: 4}
df["new"] = df['col2'].map(sortMap)
df.sort_values('new', inplace=True)
df

  col 1  col2  col 3  new
2     c     4      9    1
1     b     2      5    2
3     d     1     43    3
0     a     3      3    4

An alternative way to build the dict (ranks start at 0 here rather than 1, which gives the same ordering):

ll = [4, 2, 1, 3]
sortMap = dict(zip(ll, range(len(ll))))
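If the helper column is not wanted, the same mapping can be passed through the `key` argument of `sort_values`. This is a sketch that assumes pandas 1.1 or newer, where `key` is available; `order`, `sort_map` and `result` are illustrative names, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 3, 3], ['b', 2, 5], ['c', 4, 9], ['d', 1, 43]],
    columns=['col 1', 'col2', 'col 3'],
)

order = [4, 2, 1, 3]

# Map each value to its rank in the desired order: {4: 0, 2: 1, 1: 2, 3: 3}.
sort_map = {value: rank for rank, value in enumerate(order)}

# `key` receives the column as a Series and returns a same-length Series of sort keys.
result = df.sort_values('col2', key=lambda s: s.map(sort_map))
print(result)
```

The result keeps the original three columns and the original index labels; only the row order changes.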

Answer 2 (score: 1)

Alternative solution:

Note: I prefer @chrisb's solution; it is more elegant and will probably run faster.
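For comparison, an index-based variant can also produce the requested order. This is only a sketch, not code taken from any of the answers, and it assumes that every value in `order` occurs in `col2` (otherwise `.loc` raises a `KeyError`); `order` and `result` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 3, 3], ['b', 2, 5], ['c', 4, 9], ['d', 1, 43]],
    columns=['col 1', 'col2', 'col 3'],
)

order = [4, 2, 1, 3]

# Use col2 as the index, select its labels in the desired order,
# then turn col2 back into a regular column.
result = df.set_index('col2').loc[order].reset_index()
print(result)
```

Unlike the other approaches, this discards the original index and moves `col2` to the first column.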