I have
import pandas as pd

df = pd.DataFrame({
    'key': ['value1','value2','value1','value2'],
    'domain': ['domain1.com','domain1.com','domain2.com','domain2.com'],
    'url': ['urlB','urlA','url1','url2'],
    'score': [12,14,200,2001]})
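I want to get this result: result
I have tried transpose, stack, and so on, but I cannot get that result.
I am new to Python pandas; please advise.
[Edit]
Thanks @jezrael for the reply. It works using
df = df.set_index(['key','domain']).unstack().swaplevel(0,1, axis=1).sort_index(axis=1)
Moving on to the next step, sorting, I added more rows to the data from the start: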
df = pd.DataFrame({
'key': ['value1','value2','value1','value2','value2','value3'],
'domain': ['domain1.com','domain1.com','domain2.com','domain2.com','domain3.com','domain4.com'],
'url' :['urlB','urlA','url1','url2','url3','url4'],
'score' : [12,14,200,2001,10,5]
})
dfdomains = pd.DataFrame({
'domain': ['domain1.com','domain2.com', 'domain3.com','domain4.com'],
'order' : [3,1,2,4]
})
With your answer I got this DataFrame:
df1 = df.set_index(['key','domain']).unstack().swaplevel(0,1, axis=1).sort_index(axis=1, ascending=False)
which gives me the result:
domain  domain4.com         domain3.com         domain2.com          domain1.com
                url  score          url  score          url   score          url  score
key
value1          NaN    NaN          NaN    NaN         url1   200.0         urlB   12.0
value2          NaN    NaN         url3   10.0         url2  2001.0         urlA   14.0
value3         url4    5.0          NaN    NaN          NaN     NaN          NaN    NaN
I want to sort df1 by the order column of dfdomains: that means the first column of df1 should be domain2.com (order = 1).
Expected: image
Could you give me a suggestion? Thanks.
Answer 0 (score: 3)
Use:
df = df.set_index(['key','domain']).unstack().swaplevel(0,1, axis=1).sort_index(axis=1)
print (df)
domain  domain1.com        domain2.com
              score   url        score   url
key
value1           12  urlB          200  url1
value2           14  urlA         2001  url2
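To see what each call in the chain contributes, the same pipeline can be split into named intermediate steps. This is only an illustrative sketch: the names step1 to step4 are introduced here for inspection and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'key': ['value1', 'value2', 'value1', 'value2'],
    'domain': ['domain1.com', 'domain1.com', 'domain2.com', 'domain2.com'],
    'url': ['urlB', 'urlA', 'url1', 'url2'],
    'score': [12, 14, 200, 2001]})

# Step 1: key and domain become a two-level row MultiIndex
step1 = df.set_index(['key', 'domain'])

# Step 2: the innermost row level (domain) moves into the columns,
# below the original column names (url, score)
step2 = step1.unstack()

# Step 3: swap the two column levels so that domain ends up on top
step3 = step2.swaplevel(0, 1, axis=1)

# Step 4: sort the columns so that each domain's columns sit next to each other
step4 = step3.sort_index(axis=1)

print(step2.columns.tolist())
print(step4.columns.tolist())
```

Printing the column labels after step 2 and step 4 shows how the (column, domain) pairs turn into sorted (domain, column) pairs.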
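set_index to create a MultiIndex
unstack to reshape
swaplevel to swap the levels of the MultiIndex in the columns
sort_index to sort
Edit: First use sort_values to sort by the order column, then add DataFrame.reindex. All values of order must be present in df['domain']: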
order = dfdomains.sort_values('order')['domain']
print (order)
1    domain2.com
2    domain3.com
0    domain1.com
3    domain4.com
Name: domain, dtype: object
df1 = (df.set_index(['key','domain'])
.unstack()
.swaplevel(0,1, axis=1)
.sort_index(axis=1, ascending=False)
.reindex(order, axis=1, level=0))
print (df1)
domain  domain2.com          domain3.com         domain1.com         domain4.com  \
                url   score          url  score          url  score          url
key
value1         url1   200.0          NaN    NaN         urlB   12.0          NaN
value2         url2  2001.0         url3   10.0         urlA   14.0          NaN
value3          NaN     NaN          NaN    NaN          NaN    NaN         url4

domain
        score
key
value1    NaN
value2    NaN
value3    5.0
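If dfdomains might list domains that never occur in df, one option is to filter the ordering before calling reindex, so that the requirement on order holds. The following is a minimal sketch under that assumption; reorder_domains is a hypothetical helper name, and the filtering step is an addition that the answer above does not include.

```python
import pandas as pd

df = pd.DataFrame({
    'key': ['value1', 'value2', 'value1', 'value2', 'value2', 'value3'],
    'domain': ['domain1.com', 'domain1.com', 'domain2.com',
               'domain2.com', 'domain3.com', 'domain4.com'],
    'url': ['urlB', 'urlA', 'url1', 'url2', 'url3', 'url4'],
    'score': [12, 14, 200, 2001, 10, 5]})

dfdomains = pd.DataFrame({
    'domain': ['domain1.com', 'domain2.com', 'domain3.com', 'domain4.com'],
    'order': [3, 1, 2, 4]})


def reorder_domains(wide, domains_df):
    """Hypothetical helper: order the top column level by domains_df['order']."""
    order = domains_df.sort_values('order')['domain']
    # Added safety check (assumption): keep only domains present in the columns
    present = wide.columns.get_level_values(0).unique()
    order = order[order.isin(present)]
    return wide.reindex(order, axis=1, level=0)


# Same reshaping chain as in the answer
wide = (df.set_index(['key', 'domain'])
          .unstack()
          .swaplevel(0, 1, axis=1)
          .sort_index(axis=1, ascending=False))

df1 = reorder_domains(wide, dfdomains)
print(df1.columns.get_level_values(0).unique().tolist())
```

The helper only changes the order of the top-level domain labels; the url and score columns under each domain keep the order produced by sort_index.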