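I am trying to turn the values of one column into column headers while keeping the rest of the data. Below is my full code and the closest I have been able to get. The only problem is that I can't figure out how to keep the end column: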
import pandas as pd
starts = pd.date_range(start = '1/1/2017', freq = '31d', periods = 4).tolist()
ends = pd.date_range(start = '1/31/2017', freq = '31d', periods = 4).tolist()
df = pd.DataFrame({
    'id': ['XXX', 'XXX', 'XXX', 'XXX', 'YYY', 'YYY', 'YYY', 'YYY'],
    'start': starts + starts,
    'end': ends + ends,
    'type': ['car', 'car', 'car', 'car', 'truck', 'truck', 'truck', 'truck'],
}, columns=['id', 'start', 'end', 'type'])
Original DataFrame:
id start end type
0 XXX 2017-01-01 2017-01-31 car
1 XXX 2017-02-01 2017-03-03 car
2 XXX 2017-03-04 2017-04-03 car
3 XXX 2017-04-04 2017-05-04 car
4 YYY 2017-01-01 2017-01-31 truck
5 YYY 2017-02-01 2017-03-03 truck
6 YYY 2017-03-04 2017-04-03 truck
7 YYY 2017-04-04 2017-05-04 truck
My closest pivot attempt so far:
print(df.pivot(index='start', columns='id', values='type').reset_index())
Current output:
id start XXX YYY
0 2017-01-01 car truck
1 2017-02-01 car truck
2 2017-03-04 car truck
3 2017-04-04 car truck
Desired output:
start end XXX YYY
0 2017-01-01 2017-01-31 car truck
1 2017-02-01 2017-03-03 car truck
2 2017-03-04 2017-04-03 car truck
3 2017-04-04 2017-05-04 car truck
Any help would be appreciated.
Answer 0 (score: 5)
pd.pivot_table(df,index=['start','end'],columns='id',values='type',aggfunc='sum').reset_index()
Out[1587]:
id start end XXX YYY
0 2017-01-01 2017-01-31 car truck
1 2017-02-01 2017-03-03 car truck
2 2017-03-04 2017-04-03 car truck
3 2017-04-04 2017-05-04 car truck
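
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea: putting both start and end in the index of pivot_table so that neither is dropped. It is not the answerer's exact code. It assumes aggfunc='first' instead of 'sum' (each start/end/id cell holds exactly one string here, so 'first' picks the same value without relying on string concatenation), it writes the frequency as '31D', and the name wide plus the final columns.name cleanup are my own additions.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
starts = pd.date_range(start="1/1/2017", freq="31D", periods=4).tolist()
ends = pd.date_range(start="1/31/2017", freq="31D", periods=4).tolist()
df = pd.DataFrame({
    "id": ["XXX"] * 4 + ["YYY"] * 4,
    "start": starts + starts,
    "end": ends + ends,
    "type": ["car"] * 4 + ["truck"] * 4,
})

# Both 'start' and 'end' go into the index, so both survive the pivot.
# aggfunc="first" is an assumption: it keeps the single string in each cell.
wide = (
    df.pivot_table(index=["start", "end"], columns="id",
                   values="type", aggfunc="first")
      .reset_index()
)

# Optional cleanup (my addition): drop the leftover 'id' label on the columns.
wide.columns.name = None
print(wide)
```

If a start/end/id combination could appear more than once in real data, the choice of aggfunc decides which value ends up in the cell, so it is worth picking deliberately.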
Answer 1 (score: 4)
Use set_index and unstack:
df.set_index(['start', 'end', 'id']).type.unstack().reset_index()
id start end XXX YYY
0 2017-01-01 2017-01-31 car truck
1 2017-02-01 2017-03-03 car truck
2 2017-03-04 2017-04-03 car truck
3 2017-04-04 2017-05-04 car truck
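
A small runnable sketch of the set_index/unstack approach, on a cut-down version of the question's data (two date ranges instead of four, which is my simplification). It also illustrates one difference from pivot_table that may matter in practice: unstack does not aggregate, so a duplicated start/end/id combination raises a ValueError. The names wide, dup, and raised are hypothetical, and bracket access ['type'] is used in place of .type purely as a stylistic choice.

```python
import pandas as pd

# Cut-down version of the question's data: two date ranges per id.
df = pd.DataFrame({
    "id": ["XXX", "XXX", "YYY", "YYY"],
    "start": pd.to_datetime(["2017-01-01", "2017-02-01"] * 2),
    "end": pd.to_datetime(["2017-01-31", "2017-03-03"] * 2),
    "type": ["car", "car", "truck", "truck"],
})

# Move start, end and id into the index, then pivot the 'id' level into columns.
wide = df.set_index(["start", "end", "id"])["type"].unstack("id").reset_index()
wide.columns.name = None  # optional: drop the leftover 'id' label
print(wide)

# unstack cannot reshape when an index combination occurs twice.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)
try:
    dup.set_index(["start", "end", "id"])["type"].unstack("id")
    raised = False
except ValueError:
    raised = True
print("duplicate rows raise ValueError:", raised)
```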