I have the following dataframe in pandas:
key no lpm
ab_12 1 12
ab_12 2 11
ab_12 3 11
ac_12 1 12
ac_12 2 11
ac_12 4 11
ad_12 1 12
ad_12 2 11
ad_12 3 11
The dataframe I want is the following:
key no_1 no_2 no_3 no_4
ab_12 12 11 11 does not exist
ac_12 12 11 does not exist 11
ad_12 12 11 11 does not exist
I have been experimenting in pandas, but the following does not give me what I need:
df= df.melt('key').groupby(['key', 'value']).unstack(fill_value='Does not exist')
Answer 0 (score: 4)
Use set_index together with unstack and add_prefix:
df = df.set_index(['key', 'no'])['lpm'].unstack(fill_value='Does not exist').add_prefix('no_')
print (df)
no no_1 no_2 no_3 no_4
key
ab_12 12 11 11 Does not exist
ac_12 12 11 Does not exist 11
ad_12 12 11 11 Does not exist
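For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the sample data in the question, and the variable name `wide` is only an illustrative choice. Note that mixing numbers with a string placeholder in the same column means numeric operations on those columns will no longer work directly.

```python
import pandas as pd

# Sample data rebuilt from the question (long format: one row per key/no pair).
df = pd.DataFrame({
    'key': ['ab_12'] * 3 + ['ac_12'] * 3 + ['ad_12'] * 3,
    'no': [1, 2, 3, 1, 2, 4, 1, 2, 3],
    'lpm': [12, 11, 11] * 3,
})

# Move key and no into the index, select lpm, then pivot the no level
# into columns. Missing key/no combinations get the placeholder string.
wide = (df.set_index(['key', 'no'])['lpm']
          .unstack(fill_value='Does not exist')
          .add_prefix('no_'))

print(wide)
```

If the result is needed for further calculations, leaving out fill_value (so that missing cells stay NaN) is usually the safer choice.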
If that solution fails because some pairs of key and no are duplicated, aggregation is necessary:
df = (df.groupby(['key', 'no'])['lpm']
        .mean()
        .unstack(fill_value='Does not exist')
        .add_prefix('no_'))
Or:
df = (df.pivot_table(index='key',
                     columns='no',
                     values='lpm',
                     fill_value='Does not exist',
                     aggfunc='mean').add_prefix('no_'))
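To illustrate the duplicate case, here is a small sketch with hypothetical data (not from the question) in which the pair ('ab_12', 1) appears twice. The plain set_index/unstack approach raises a ValueError on such input, while grouping and taking the mean first collapses the duplicates. The names `dup`, `agg`, and `reshape_failed` are made up for this example.

```python
import pandas as pd

# Hypothetical data: the pair ('ab_12', 1) occurs twice.
dup = pd.DataFrame({
    'key': ['ab_12', 'ab_12', 'ab_12', 'ac_12'],
    'no': [1, 1, 2, 1],
    'lpm': [12, 10, 11, 12],
})

# unstack cannot reshape an index that contains duplicate entries.
try:
    dup.set_index(['key', 'no'])['lpm'].unstack()
    reshape_failed = False
except ValueError:
    reshape_failed = True

# Aggregating first leaves exactly one value per key/no pair.
agg = (dup.groupby(['key', 'no'])['lpm']
          .mean()
          .unstack(fill_value='Does not exist')
          .add_prefix('no_'))

print(reshape_failed)
print(agg)
```

Because mean returns floats, the aggregated values come back as floats rather than the original integers. Whether mean is the right aggregation depends on the data; first, max, or sum may fit better.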
EDIT: For a suffix, add add_suffix:
df = (df.set_index(['key', 'no'])['lpm']
        .unstack(fill_value='Does not exist')
        .add_prefix('no_')
        .add_suffix('_lpm'))
print (df)
no no_1_lpm no_2_lpm no_3_lpm no_4_lpm
key
ab_12 12 11 11 Does not exist
ac_12 12 11 Does not exist 11
ad_12 12 11 11 Does not exist
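As a possible variation, the prefix and suffix can also be applied in one step by passing a function to rename. This is only a sketch that rebuilds the question's sample data by hand and should give the same column names as chaining add_prefix and add_suffix; the variable name `wide` is illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    'key': ['ab_12'] * 3 + ['ac_12'] * 3 + ['ad_12'] * 3,
    'no': [1, 2, 3, 1, 2, 4, 1, 2, 3],
    'lpm': [12, 11, 11] * 3,
})

wide = (df.set_index(['key', 'no'])['lpm']
          .unstack(fill_value='Does not exist')
          # Build each column name from the original no value in one pass.
          .rename(columns=lambda no: f'no_{no}_lpm'))

print(wide)
```

This form is convenient when the naming pattern is more involved than a fixed prefix and suffix.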