I have a question.
My initial df looks like this:
Col 1 Col 2 Col 3
laura purchase 1 dress23
laura puchase 2 skirt55
laura purchase 3 shirt47
laura purchase 4 coat45
julia puchase 1 skirt74
julia purchase 2 short74
julia purchase 3 coat14
julia purchase 4 coat15
I would like to use the pandas library to get this:
Col 1 Purchase 1 Purchase 2 Purchase 3 Purchase 4
Laura dress23 skirt55 shirt47 coat45
Julia skirt74 short74 coat14 coat15
Could you please help me?
Best regards,
Thanks,
Answer 0 (score: 1)
Given the data:
col 1 col 2 col 3
0 laura purchase 1 dress23
1 laura purchase 2 skirt55
2 laura purchase 3 shirt47
3 laura purchase 4 coat45
4 julia purchase 1 skirt74
5 julia purchase 2 short74
6 julia purchase 3 coat14
7 julia purchase 4 coat15
Transformation:
df = df.pivot(index='col 1', columns='col 2', values='col 3').reset_index()
df = df.rename(columns={'col 1': 'name'})
df.columns.name = 'id'
print(df)
Result:
id name purchase 1 purchase 2 purchase 3 purchase 4
0 julia skirt74 short74 coat14 coat15
1 laura dress23 skirt55 shirt47 coat45
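For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same transformation. The DataFrame construction is an assumption added for illustration (the answer only shows the printed data), and the variable name `wide` is hypothetical.

```python
import pandas as pd

# Rebuild the sample data shown above (assumed construction, labels already consistent).
df = pd.DataFrame({
    "col 1": ["laura"] * 4 + ["julia"] * 4,
    "col 2": ["purchase 1", "purchase 2", "purchase 3", "purchase 4"] * 2,
    "col 3": ["dress23", "skirt55", "shirt47", "coat45",
              "skirt74", "short74", "coat14", "coat15"],
})

# One row per name, one column per purchase label.
wide = df.pivot(index="col 1", columns="col 2", values="col 3").reset_index()
wide = wide.rename(columns={"col 1": "name"})
wide.columns.name = "id"  # label shown in the top-left corner when printing
print(wide)
```

Note that pivot raises an error if the same (name, purchase) pair occurs more than once, so this only works when each pair is unique, as it is here.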
Answer 1 (score: 1)
Use set_index, unstack, and reset_index:
df.set_index(['Col 1','Col 2'])['Col 3'].unstack().reset_index()
Output:
Col 2 Col 1 puchase 1 puchase 2 purchase 1 purchase 2 purchase 3 purchase 4
0 julia skirt74 None None short74 coat14 coat15
1 laura None skirt55 dress23 None shirt47 coat45
The misspelled labels ("puchase") in Col 2 end up as separate columns, so do a little data cleaning first. New input df:
Col 1 Col 2 Col 3
0 laura purchase 1 dress23
1 laura purchase 2 skirt55
2 laura purchase 3 shirt47
3 laura purchase 4 coat45
4 julia purchase 1 skirt74
5 julia purchase 2 short74
6 julia purchase 3 coat14
7 julia purchase 4 coat15
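The cleaning step itself is not shown above. One possible sketch, assuming the only defect is the misspelling "puchase" in Col 2 (the `raw` frame is rebuilt here from the question's data purely for illustration):

```python
import pandas as pd

# The question's data, including the misspelled "puchase" labels.
raw = pd.DataFrame({
    "Col 1": ["laura"] * 4 + ["julia"] * 4,
    "Col 2": ["purchase 1", "puchase 2", "purchase 3", "purchase 4",
              "puchase 1", "purchase 2", "purchase 3", "purchase 4"],
    "Col 3": ["dress23", "skirt55", "shirt47", "coat45",
              "skirt74", "short74", "coat14", "coat15"],
})

# Assumption: "puchase" is the only typo, so a literal replacement is enough.
df = raw.copy()
df["Col 2"] = df["Col 2"].str.replace("puchase", "purchase", regex=False)

# Quick check that only the four intended labels remain.
print(sorted(df["Col 2"].unique()))
```

With messier real data, inspecting `df["Col 2"].value_counts()` before reshaping is a cheap way to spot stray labels like these.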
Now, do the pandas reshape:
df.set_index(['Col 1','Col 2'])['Col 3'].unstack().reset_index()
Output:
Col 2 Col 1 purchase 1 purchase 2 purchase 3 purchase 4
0 julia skirt74 short74 coat14 coat15
1 laura dress23 skirt55 shirt47 coat45
Or use pivot and reset_index:
df.pivot(index='Col 1',columns = 'Col 2', values= 'Col 3').reset_index()
Output:
Col 2 Col 1 purchase 1 purchase 2 purchase 3 purchase 4
0 julia skirt74 short74 coat14 coat15
1 laura dress23 skirt55 shirt47 coat45
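Both variants sort the rows alphabetically and keep the lowercase labels, whereas the layout requested in the question lists Laura first and capitalizes the names and headers. If that detail matters, a possible follow-up is sketched below. It assumes the capitalization in the question was intentional, and the name `wide` is hypothetical.

```python
import pandas as pd

# Cleaned input, rebuilt here so the snippet runs on its own.
df = pd.DataFrame({
    "Col 1": ["laura"] * 4 + ["julia"] * 4,
    "Col 2": ["purchase 1", "purchase 2", "purchase 3", "purchase 4"] * 2,
    "Col 3": ["dress23", "skirt55", "shirt47", "coat45",
              "skirt74", "short74", "coat14", "coat15"],
})

wide = df.pivot(index="Col 1", columns="Col 2", values="Col 3")

# pivot sorts the index, so restore the order of first appearance.
wide = wide.reindex(df["Col 1"].unique())

# Capitalize the row labels (names) and the column labels (purchases).
wide = wide.rename(index=str.capitalize, columns=str.capitalize)

wide.index.name = "Col 1"   # becomes the first column header after reset_index
wide.columns.name = None    # drop the leftover "Col 2" label
wide = wide.reset_index()
print(wide)
```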