My dataframe looks like this:
ID Species Count
1 Pine 1000
1 Spruce 1000
2 Pine 2000
3 Pine 1000
3 Spruce 500
3 Birch 500
What I want is:
     Pine   Spruce  Birch
ID   Count  Count   Count
1    1000   1000
2    2000
3    1000   500     500
So I tried:
a = df.groupby(['ID']).cumcount().astype(str)
newdf = df.set_index(['ID', a]).unstack(fill_value=0).sort_index(level=1, axis=1)
This gives me:
ID Count Species Count Species Count Species
1 1000 Pine 1000 Spruce
2 2000 Pine
3 1000 Pine 500 Spruce 500 Birch
How can I fix this?
Answer 0 (score: 2)
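A simple pivot does it (the arguments are passed by keyword because pandas 2.0 and later no longer accept them positionally):
df.pivot(index='ID', columns='Species', values='Count')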
Out[493]:
Species Birch Pine Spruce
ID
1 NaN 1000.0 1000.0
2 NaN 2000.0 NaN
3 500.0 1000.0 500.0
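For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the dataframe from the question by hand (the variable name `wide` is just illustrative) and shows that `pivot` leaves `NaN` where an ID has no row for a species, which is also why the counts come back as floats.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3, 3],
    "Species": ["Pine", "Spruce", "Pine", "Pine", "Spruce", "Birch"],
    "Count": [1000, 1000, 2000, 1000, 500, 500],
})

# One row per ID, one column per species; missing combinations become NaN.
wide = df.pivot(index="ID", columns="Species", values="Count")
print(wide)
```

Note that `pivot` raises an error if the same (ID, Species) pair appears more than once, so this assumes each pair is unique, as it is in the sample data.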
Answer 1 (score: 1)
In [94]: df.set_index(['ID', 'Species'])['Count'].unstack(fill_value=0)
Out[94]:
Species Birch Pine Spruce
ID
1 0 1000 1000
2 0 2000 0
3 500 1000 500
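The columns above come out sorted alphabetically, while the question lists them as Pine, Spruce, Birch. If that order matters, one possible approach (a sketch, not the only way) is to reindex the columns by order of first appearance. The names `species_order` and `wide` are made up for this example.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3, 3],
    "Species": ["Pine", "Spruce", "Pine", "Pine", "Spruce", "Birch"],
    "Count": [1000, 1000, 2000, 1000, 500, 500],
})

# Species in order of first appearance: Pine, Spruce, Birch.
species_order = df["Species"].unique()

wide = (
    df.set_index(["ID", "Species"])["Count"]
      .unstack(fill_value=0)           # missing combinations become 0
      .reindex(columns=species_order)  # restore the order from the question
)
print(wide)
```

Because `fill_value=0` is applied during the unstack, the counts stay integers instead of being upcast to float.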