Question

Given a DataFrame like this:

id   col1     col2           col3
------------------------------------------
1    [2,3,1]  ['a','b','c']  ['d','e','f']
2    [3,2,1]  ['a','b','c']  ['d','e','f']

What is the most efficient way to sort the lists in col1 and reorder the lists in col2 and col3 by that same sorted order, to get this result?

id   col1     col2           col3
------------------------------------------
1    [1,2,3]  ['c','a','b']  ['f','d','e']
2    [1,2,3]  ['c','b','a']  ['f','e','d']

Thanks.

Answer 1

You can try the following:

import pandas as pd

df = pd.DataFrame({'col1':[ [2,3,1], [3,2,1]  ],
                   'col2':[ ['a','b','c'], ['a','b','c'] ],
                   'col3':[ ['d','e','f'], ['d','e','f'] ]})

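def custom_sort(x):
    # Pair each col1 value with its original position, then sort by value
    col1 = sorted(enumerate(x.col1), key=lambda k: k[1])
    # Reorder col2 and col3 by those same original positions
    col2 = [x.col2[i] for i, _ in col1]
    col3 = [x.col3[i] for i, _ in col1]
    return [v for _, v in col1], col2, col3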

df[['col1', 'col2', 'col3']] = df[['col1', 'col2', 'col3']].apply(custom_sort, axis=1, result_type='expand')
print(df)

Prints:

        col1       col2       col3
0  [1, 2, 3]  [c, a, b]  [f, d, e]
1  [1, 2, 3]  [c, b, a]  [f, e, d]
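
The per-row idea behind this answer can be shown on plain lists, without pandas. The following is only a minimal sketch: `sort_lists_by_key` is a hypothetical helper name that is not part of the answer, and it assumes all the lists passed in have the same length.

```python
def sort_lists_by_key(key, *others):
    """Sort `key` and reorder every list in `others` by the same permutation."""
    # Positions of `key` in ascending order of value (ties keep their order).
    order = sorted(range(len(key)), key=key.__getitem__)
    # Apply that one permutation to the key list and to each companion list.
    return [[seq[i] for i in order] for seq in (key, *others)]


result = sort_lists_by_key([2, 3, 1], ['a', 'b', 'c'], ['d', 'e', 'f'])
print(result)  # [[1, 2, 3], ['c', 'a', 'b'], ['f', 'd', 'e']]
```

Computing the permutation once and reusing it keeps the three lists aligned, which is the same thing `custom_sort` does with `enumerate`.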

Answer 2

I would use NumPy's argsort on col1, then use apply with fancy indexing on each column:

import numpy as np

# Per-row sort order of col1 (requires all lists to have the same length)
m = np.array(df.col1.tolist()).argsort()
i_0 = np.arange(df.shape[0])[:,None]
df[['col1','col2','col3']] = df[['col1','col2','col3']].apply(lambda x:
                                            np.array(x.tolist())[i_0, m].tolist())

Out[1700]:
   id       col1       col2       col3
0   1  [1, 2, 3]  [c, a, b]  [f, d, e]
1   2  [1, 2, 3]  [c, b, a]  [f, e, d]
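
To see what the argsort and fancy-indexing step does on its own, here is a small sketch on bare NumPy arrays. The variable names (`keys`, `letters`, `order`, `rows`) are illustrative and not from the answer, and it assumes rectangular data, that is, every row has the same number of elements. `np.take_along_axis` is shown as an equivalent way to write the same gather.

```python
import numpy as np

keys = np.array([[2, 3, 1], [3, 2, 1]])
letters = np.array([['a', 'b', 'c'], ['a', 'b', 'c']])

# Per-row permutation that would sort each row of `keys`.
order = keys.argsort(axis=1)

# Fancy indexing: a column vector of row indices broadcasts against `order`,
# so element [r, c] of the result is source[r, order[r, c]].
rows = np.arange(keys.shape[0])[:, None]
sorted_keys = keys[rows, order]
sorted_letters = letters[rows, order]

# The same gather without building `rows` explicitly.
same_letters = np.take_along_axis(letters, order, axis=1)

print(sorted_keys.tolist())     # [[1, 2, 3], [1, 2, 3]]
print(sorted_letters.tolist())  # [['c', 'a', 'b'], ['c', 'b', 'a']]
```

Because the permutation is computed for all rows at once, this avoids a Python-level loop over rows, but it only works when the lists can be stacked into a regular 2-D array.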

Pandas: sort lists in columns based on the values in another column's list
