The data contains:
timeslot Weather Location Slot
2014-10-26 00:00 35 1 1
2014-10-26 06:00 36 1 2
2014-10-26 12:00 34 1 3
2014-10-26 18:00 34 1 4
2014-10-27 00:00 35 1 1
2014-10-27 06:00 36 1 2
2014-10-27 12:00 36 1 3
2014-10-27 18:00 32 1 4
2014-10-28 00:00 35 1 1
2014-10-28 06:00 33 1 2
2014-10-28 12:00 35 1 3
2014-10-28 18:00 33 1 4
2014-10-26 00:00 45 2 1
2014-10-26 06:00 46 2 2
2014-10-26 12:00 41 2 3
2014-10-26 18:00 39 2 4
2014-10-27 00:00 46 2 1
2014-10-27 06:00 44 2 2
2014-10-27 12:00 45 2 3
2014-10-27 18:00 42 2 4
2014-10-28 00:00 41 2 1
2014-10-28 06:00 40 2 2
2014-10-28 12:00 42 2 3
2014-10-28 18:00 41 2 4
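The data contains the weather for two locations. Each day is divided into 6-hour time slots (Slot 1 to 4). I want to convert the data into a pivot table.
The code I tried is: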
df.pivot(index='Location', columns='Timeslot', values='weather')
The output should be:
Timeslot 2014-10-26 || 2014-10-27 || 2014-10-28
---------------------------------------------------------------------------
slot 1 2 3 4 || 1 2 3 4 || 1 2 3 4
---------------------------------------------------------------------------
Location
1 35 36 34 34 35 36 36 32 35 33 35 33
2 45 46 41 39 46 44 45 42 41 40 42 41
Answer 0 (score: 1)
Use DataFrame.set_index together with DataFrame.unstack; to get the dates, use Series.dt.date:
df['timeslot'] = pd.to_datetime(df['timeslot'])
df = df.set_index(['Location', df['timeslot'].dt.date, 'Slot'])['Weather'].unstack([1,2])
print (df)
timeslot 2014-10-26 2014-10-27 2014-10-28
Slot 1 2 3 4 1 2 3 4 1 2 3 4
Location
1 35 36 34 34 35 36 36 32 35 33 35 33
2 45 46 41 39 46 44 45 42 41 40 42 41
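For anyone who wants to reproduce this without the original file, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand (the variable names `days`, `hours`, and `wide` are my own, not from the question) and then applies the same set_index / unstack steps:

```python
import pandas as pd

# Rebuild the question's sample data: 3 days x 4 slots for each of 2 locations.
days = ["2014-10-26", "2014-10-27", "2014-10-28"]
hours = ["00:00", "06:00", "12:00", "18:00"]
timeslots = [f"{d} {h}" for d in days for h in hours]

df = pd.DataFrame({
    "timeslot": timeslots * 2,
    "Weather": [35, 36, 34, 34, 35, 36, 36, 32, 35, 33, 35, 33,
                45, 46, 41, 39, 46, 44, 45, 42, 41, 40, 42, 41],
    "Location": [1] * 12 + [2] * 12,
    "Slot": [1, 2, 3, 4] * 6,
})

# Parse the strings so that the .dt accessor is available.
df["timeslot"] = pd.to_datetime(df["timeslot"])

# Index by (Location, date, Slot), then move date and Slot into the columns.
wide = (
    df.set_index(["Location", df["timeslot"].dt.date, "Slot"])["Weather"]
      .unstack([1, 2])
)
print(wide)
```

The result is assigned to a new name (`wide`) rather than back to `df`, so the long-format frame stays available for further reshaping.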
If duplicate combinations are possible (triples of Location, the date of timeslot, and Slot), you have to aggregate with DataFrame.pivot_table:
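# run this on the original long-format df (after pd.to_datetime),
# not on the already unstacked result
df = df.pivot_table(index='Location',
                    columns=[df['timeslot'].dt.date, 'Slot'],
                    values='Weather',
                    aggfunc='mean')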
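To illustrate the duplicate case, here is a small sketch with made-up rows (these values are not from the question). The triple (Location 1, 2014-10-26, Slot 1) appears twice, which would make unstack fail with a "duplicate entries" error, while pivot_table averages the two readings. As a variation on the code above, the date is stored in a helper column that I call `date` (an assumption for this example), so that all pivot keys are plain column labels:

```python
import pandas as pd

# Hypothetical data: the first two rows share Location, date and Slot.
df = pd.DataFrame({
    "timeslot": pd.to_datetime([
        "2014-10-26 00:00", "2014-10-26 00:00",  # duplicated triple
        "2014-10-26 06:00", "2014-10-26 00:00",
    ]),
    "Weather": [35, 37, 36, 45],
    "Location": [1, 1, 1, 2],
    "Slot": [1, 1, 2, 1],
})

# Helper column (name chosen for this example) holding only the calendar date.
df["date"] = df["timeslot"].dt.date

# Duplicated (Location, date, Slot) combinations are aggregated with the mean.
wide = df.pivot_table(index="Location",
                      columns=["date", "Slot"],
                      values="Weather",
                      aggfunc="mean")
print(wide)
```

Here Location 1, Slot 1 becomes the mean of 35 and 37, and the missing combination (Location 2, Slot 2) shows up as NaN. Any other aggregation accepted by `aggfunc` (for example 'first' or 'max') could be used instead, depending on how duplicates should be resolved.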