Data to pivot table

Time: 2019-07-18 13:06:43

Tags: python pandas

The data contains:

timeslot        Weather  Location      Slot 
2014-10-26 00:00    35     1             1
2014-10-26 06:00    36     1             2
2014-10-26 12:00    34     1             3
2014-10-26 18:00    34     1             4
2014-10-27 00:00    35     1             1
2014-10-27 06:00    36     1             2
2014-10-27 12:00    36     1             3
2014-10-27 18:00    32     1             4
2014-10-28 00:00    35     1             1
2014-10-28 06:00    33     1             2
2014-10-28 12:00    35     1             3
2014-10-28 18:00    33     1             4
2014-10-26 00:00    45     2             1
2014-10-26 06:00    46     2             2
2014-10-26 12:00    41     2             3
2014-10-26 18:00    39     2             4
2014-10-27 00:00    46     2             1
2014-10-27 06:00    44     2             2
2014-10-27 12:00    45     2             3
2014-10-27 18:00    42     2             4
2014-10-28 00:00    41     2             1
2014-10-28 06:00    40     2             2
2014-10-28 12:00    42     2             3
2014-10-28 18:00    41     2             4

The data contains the weather for two locations. Each day is divided into 6-hour time slots. I want to convert the data into a pivot table.

The code I tried is:

df.pivot(index='Location', columns='Timeslot', values='weather')

The output should be:

 Timeslot           2014-10-26    ||      2014-10-27    ||     2014-10-28
---------------------------------------------------------------------------
  slot           1    2   3    4  ||  1    2   3    4   ||   1    2   3    4
---------------------------------------------------------------------------
Location
    1           35   36  34   34     35   36   36  32       35   33   35   33
    2           45   46  41   39     46   44   45  42       41   40   42   41


1 answer:

Answer 0 (score: 1)

Use DataFrame.set_index together with DataFrame.unstack; to get the dates, use Series.dt.date:

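import pandas as pd
# Parse the timestamps so that the .dt accessor is available
df['timeslot'] = pd.to_datetime(df['timeslot'])
# Index by (Location, date, Slot), then move the date and Slot levels into the columns
df = df.set_index(['Location', df['timeslot'].dt.date, 'Slot'])['Weather'].unstack([1,2])
print(df)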
timeslot 2014-10-26             2014-10-27             2014-10-28            
Slot              1   2   3   4          1   2   3   4          1   2   3   4
Location                                                                     
1                35  36  34  34         35  36  36  32         35  33  35  33
2                45  46  41  39         46  44  45  42         41  40  42  41
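For anyone who wants to run this without the original file, here is a minimal self-contained sketch of the same set_index/unstack approach. It rebuilds only a subset of the question's rows (the first two slots of the first two days), and it stores the date in a helper column named `date` (a name assumed here, not in the original) so the levels can be unstacked by name instead of by position:

```python
import pandas as pd

# Subset of the question's data: two locations, two days, slots 1 and 2 only.
df = pd.DataFrame({
    "timeslot": ["2014-10-26 00:00", "2014-10-26 06:00",
                 "2014-10-27 00:00", "2014-10-27 06:00"] * 2,
    "Weather": [35, 36, 35, 36, 45, 46, 46, 44],
    "Location": [1, 1, 1, 1, 2, 2, 2, 2],
    "Slot": [1, 2, 1, 2, 1, 2, 1, 2],
})
df["timeslot"] = pd.to_datetime(df["timeslot"])

# Hypothetical helper column holding the calendar date of each timestamp.
df["date"] = df["timeslot"].dt.date

# Index by (Location, date, Slot), then move date and Slot into the columns.
wide = (
    df.set_index(["Location", "date", "Slot"])["Weather"]
      .unstack(["date", "Slot"])
)
print(wide)
```

Unstacking by level name behaves the same as `unstack([1,2])` here; it just does not depend on the order in which the index levels were set.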

If duplicate combinations are possible (triples of Location, date of timeslot, and Slot), you have to aggregate them with DataFrame.pivot_table:

# Run this on the original long-format df (after pd.to_datetime), not on the unstacked result
df = df.pivot_table(index='Location',
                    columns=[df['timeslot'].dt.date, 'Slot'],
                    values='Weather',
                    aggfunc='mean')
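To illustrate why the aggregation matters, the following sketch uses a few made-up rows (not from the question) in which Location 1 has two readings for the same date and slot. The helper column `date` and the variable names are assumptions for this example. With such duplicates, unstack cannot reshape and raises a ValueError, while pivot_table averages the duplicated readings:

```python
import pandas as pd

# Made-up rows: the first two share the same Location, date and Slot.
df = pd.DataFrame({
    "timeslot": pd.to_datetime(["2014-10-26 00:00", "2014-10-26 00:00",
                                "2014-10-26 06:00"]),
    "Weather": [35, 37, 36],
    "Location": [1, 1, 1],
    "Slot": [1, 1, 2],
})
df["date"] = df["timeslot"].dt.date  # hypothetical helper column

# unstack needs unique index entries, so the duplicate makes it fail.
try:
    df.set_index(["Location", "date", "Slot"])["Weather"].unstack(["date", "Slot"])
    unstack_failed = False
except ValueError:
    unstack_failed = True

# pivot_table aggregates duplicates instead: mean of 35 and 37 for slot 1.
pivoted = df.pivot_table(index="Location", columns=["date", "Slot"],
                         values="Weather", aggfunc="mean")
print(unstack_failed)
print(pivoted)
```

Whether the mean is the right aggregate depends on the data; `aggfunc` also accepts other reducers such as 'max' or 'first'.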