Suppose I have this DataFrame:
>>> import pandas as pd
>>> df = pd.DataFrame({'grp':['A', 'A', 'B', 'B', 'B'], 'pos' : [1, 2, 1, 2, 3], 'desc1' : ['X1', 'X2', 'Y1', 'Y2', 'Y3'], 'desc2' : ['A1', 'A2', 'A1', 'A2', 'A3']})
>>> df['desc'] = df.desc1 + ' (' + df.desc2 + ')'
>>> df = df.drop(columns=['desc1', 'desc2'])
>>> df
  grp  pos     desc
0   A    1  X1 (A1)
1   A    2  X2 (A2)
2   B    1  Y1 (A1)
3   B    2  Y2 (A2)
4   B    3  Y3 (A3)
I want to convert it into the following DataFrame:
  grp     pos1     pos2     pos3
0   A  X1 (A1)  X2 (A2)     None
1   B  Y1 (A1)  Y2 (A2)  Y3 (A3)
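I want to group everything by "grp" and have the desc for each position in its own column. The number of positions per group varies. How can I do this?
Thanks.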
Answer 0 (score: 2)
If I understand correctly, you can use pd.crosstab:
# 'first' keeps the desc of each (grp, pos) cell and still works if a pair is duplicated
df1 = pd.crosstab(df.grp, df.pos, df.desc, aggfunc='first')\
    .add_prefix('pos')\
    .reset_index()\
    .rename_axis(None, axis=1)
print(df1)
  grp     pos1     pos2     pos3
0   A  X1 (A1)  X2 (A2)      NaN
1   B  Y1 (A1)  Y2 (A2)  Y3 (A3)
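If every (grp, pos) pair occurs at most once, as in the sample data, the same reshape may also be possible without an aggregation function. The following is only a sketch under that assumption, using DataFrame.pivot. The name `wide` is just illustrative, and the frame is built directly with the combined desc column to keep the example short.

```python
import pandas as pd

# Sample data from the question, with desc already combined
df = pd.DataFrame({
    'grp': ['A', 'A', 'B', 'B', 'B'],
    'pos': [1, 2, 1, 2, 3],
    'desc': ['X1 (A1)', 'X2 (A2)', 'Y1 (A1)', 'Y2 (A2)', 'Y3 (A3)'],
})

wide = (
    df.pivot(index='grp', columns='pos', values='desc')  # one column per pos value
      .add_prefix('pos')                                  # 1, 2, 3 -> pos1, pos2, pos3
      .reset_index()                                      # turn grp back into a column
      .rename_axis(None, axis=1)                          # drop the leftover 'pos' axis name
)
print(wide)
```

Unlike the aggregating variants, pivot raises a ValueError when a (grp, pos) pair appears more than once, which can be useful if duplicates would indicate a data problem.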
Answer 1 (score: 1)
Alternatively, you can solve it with groupby and unstack:
import pandas as pd
df = pd.DataFrame({'grp':['A', 'A', 'B', 'B', 'B'], 'pos' : [1, 2, 1, 2, 3], 'desc1' : ['X1', 'X2', 'Y1', 'Y2', 'Y3'], 'desc2' : ['A1', 'A2', 'A1', 'A2', 'A3']})
df['desc'] = df.desc1 + ' (' + df.desc2 + ')'
df = df.drop(columns=['desc1', 'desc2'])
df1 = df.groupby(['grp', 'pos'])['desc'].first().unstack('pos')
print(df1)
# Output:
pos        1        2        3
grp
A    X1 (A1)  X2 (A2)      NaN
B    Y1 (A1)  Y2 (A2)  Y3 (A3)
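The result above still has grp as the index and plain 1, 2, 3 as column labels. To get closer to the layout asked for in the question, the chain could be extended roughly as follows. This is a sketch rather than part of the original answer, and the name `wide` is only illustrative.

```python
import pandas as pd

# Sample data from the question, with desc already combined
df = pd.DataFrame({
    'grp': ['A', 'A', 'B', 'B', 'B'],
    'pos': [1, 2, 1, 2, 3],
    'desc': ['X1 (A1)', 'X2 (A2)', 'Y1 (A1)', 'Y2 (A2)', 'Y3 (A3)'],
})

wide = (
    df.groupby(['grp', 'pos'])['desc'].first()  # first non-null desc per (grp, pos)
      .unstack('pos')                           # pos values become columns
      .add_prefix('pos')                        # 1, 2, 3 -> pos1, pos2, pos3
      .reset_index()                            # turn grp back into a column
      .rename_axis(None, axis=1)                # drop the leftover 'pos' axis name
)
print(wide)
```

Groups with fewer positions get NaN in the missing cells, which is how pandas marks missing values here instead of the None shown in the question.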