My pandas DataFrame looks like this:
import pandas as pd
import numpy as np
df = pd.DataFrame({'CATEGORY': [1, 1, 2, 2],
'GROUP': ['A', 'A', 'B', 'B'],
'XYZ': [3000, 2500, 3000, 3000],
'VAL': [3000, 2500, 3000, 3000],
'A_CLASS': [3000, 2500, 3000, 3000],
'B_CAL': [3000, 4500, 3000, 1000],
'C_CLASS': [3000, 2500, 3000, 3000],
'A_CAL': [3000, 2500, 3000, 3000],
'B_CLASS': [3000, 4500, 3000, 500],
'C_CAL': [3000, 2500, 3000, 3000],
'ABC': [3000, 2500, 3000, 3000]})
df
CATEGORY GROUP XYZ VAL A_CLASS B_CAL C_CLASS A_CAL B_CLASS C_CAL ABC
1 A 3000 1 3000 3000 3000 3000 3000 3000 3000
1 A 2500 2 2500 4500 2500 2500 4500 2500 2500
2 B 3000 4 3000 3000 3000 3000 3000 3000 3000
2 B 3000 1 3000 1000 3000 3000 500 3000 3000
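I want the columns in my final DataFrame arranged in the following order:
GROUP, CATEGORY, all columns with the suffix "_CAL", all columns with the suffix "_CLASS", then all other fields.
My expected output: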
GROUP CATEGORY B_CAL A_CAL C_CAL A_CLASS C_CLASS B_CLASS XYZ VAL ABC
A 1 3000 3000 3000 3000 3000 3000 3000 1 3000
A 1 4500 2500 2500 2500 2500 4500 2500 2 2500
A 1 8000 7000 8000 8000 8000 8000 8000 5 8000
B 2 3000 3000 3000 3000 3000 3000 3000 4 3000
B 2 1000 3000 3000 3000 3000 500 3000 1 3000
Answer 0 (score: 3)
Use sorted with a custom key:
first = ['GROUP','CATEGORY']
cols = sorted(df.columns.difference(first),
key=lambda x: (not x.endswith('_CAL'), not x.endswith('_CLASS')))
df[first+cols]
GROUP CATEGORY A_CAL B_CAL C_CAL A_CLASS B_CLASS C_CLASS ABC VAL \
0 A 1 3000 3000 3000 3000 3000 3000 3000 3000
1 A 1 2500 4500 2500 2500 4500 2500 2500 2500
2 B 2 3000 3000 3000 3000 3000 3000 3000 3000
3 B 2 3000 1000 3000 3000 500 3000 3000 3000
XYZ
0 3000
1 2500
2 3000
3 3000
For more details, here's a similar one
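To see why the key above produces that order, here is a minimal sketch that applies the same tuple key to the column names from the question. The helper name sort_key is only illustrative. It relies on two facts: False sorts before True, and Index.difference returns its result sorted by default, which is why the columns within each group come out alphabetically rather than in their original order.

```python
import pandas as pd

# Column names copied from the question's DataFrame.
columns = pd.Index(['CATEGORY', 'GROUP', 'XYZ', 'VAL', 'A_CLASS', 'B_CAL',
                    'C_CLASS', 'A_CAL', 'B_CLASS', 'C_CAL', 'ABC'])
first = ['GROUP', 'CATEGORY']

def sort_key(name):
    # '_CAL' columns   -> (False, True)   sorts first
    # '_CLASS' columns -> (True, False)   sorts second
    # everything else  -> (True, True)    sorts last
    return (not name.endswith('_CAL'), not name.endswith('_CLASS'))

# difference() drops the leading columns and returns the rest sorted.
rest = columns.difference(first)

# sorted() is stable, so the alphabetical order is kept within each group.
cols = sorted(rest, key=sort_key)
print(first + cols)
```

Note that this gives A_CAL, B_CAL, C_CAL, whereas the expected output in the question lists B_CAL, A_CAL, C_CAL (the original column order).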
Answer 1 (score: 2)
You just need to work with the column-name strings:
cols = df.columns
cols_sorted = ["GROUP", "CATEGORY"] +\
[col for col in cols if col.endswith('_CAL')] +\
[col for col in cols if col.endswith('_CLASS')]
cols_sorted += sorted([col for col in cols if col not in cols_sorted])
df = df[cols_sorted]
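If the remaining columns should also keep their original order (as in the question's expected output, which ends with XYZ, VAL, ABC), the same idea can be wrapped in a small function without the final sorted call. This is only a sketch: reorder_columns and its parameters are hypothetical names, not part of pandas, and it assumes all column labels are strings.

```python
import pandas as pd

def reorder_columns(df, first=('GROUP', 'CATEGORY'), suffixes=('_CAL', '_CLASS')):
    """Hypothetical helper: leading columns, then one group per suffix,
    then everything else, each group in its original column order."""
    ordered = list(first)
    for suffix in suffixes:
        # Collect the columns ending with this suffix, in original order.
        ordered += [c for c in df.columns
                    if c.endswith(suffix) and c not in ordered]
    # Append whatever has not been placed yet, again in original order.
    ordered += [c for c in df.columns if c not in ordered]
    return df[ordered]

df = pd.DataFrame({'CATEGORY': [1, 1, 2, 2],
                   'GROUP': ['A', 'A', 'B', 'B'],
                   'XYZ': [3000, 2500, 3000, 3000],
                   'VAL': [3000, 2500, 3000, 3000],
                   'A_CLASS': [3000, 2500, 3000, 3000],
                   'B_CAL': [3000, 4500, 3000, 1000],
                   'C_CLASS': [3000, 2500, 3000, 3000],
                   'A_CAL': [3000, 2500, 3000, 3000],
                   'B_CLASS': [3000, 4500, 3000, 500],
                   'C_CAL': [3000, 2500, 3000, 3000],
                   'ABC': [3000, 2500, 3000, 3000]})

result = reorder_columns(df)
print(list(result.columns))
```

Passing the suffixes as a parameter keeps the grouping rule in one place, so adding another suffix group would not require another list comprehension.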