Create a week-wise dataframe from a date-wise dataframe in pandas

Time: 2017-10-06 07:00:07

Tags: python pandas

I have data like this:

                                          Average    Std         Rank
Index           
('East', 'Mid', 'Equities', '2017/09/01')   7.1      2.3            5
('East', 'Mid', 'Equities', '2017/09/04')   6.4      4.2           14
('West', 'Mid', 'Equities', '2017/09/05')   6.3      4.3           16
('East', 'Mid', 'Equities', '2017/09/06')   4        1.8           18

I need to group it by week so that it looks like this:

Week-1                            Average  Std     Rank

East Mid Equities 2017/09/04       6.4      4.2     14  
West Mid Equities 2017/09/05       6.3      4.3     16     

Week-2
East Mid Equities 2017/09/12       8.1      1.7    25

And so on.

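The Average, Std and Rank columns are derived from some other dataframes. I only need to group the dates by week number (1-4), since this is monthly data, so I need to add Week-1, Week-2, etc. as an index here. Which functions can help me generate a dataframe like this? Thanks in advance.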

1 answer:

Answer 0: (score: 1)

Use resample with a weekly frequency anchored on Monday ('W-MON', where each weekly bin ends on a Monday) and aggregate:

import pandas as pd
# weekly bins ending on Monday; the index must hold dates first
df.index = pd.to_datetime(df.index)
df = df.resample('W-MON').agg({'Average':'mean', 'Std':'std'})
print (df)
            Average       Std
Date                         
2017-09-04     6.75  1.343503
2017-09-11     5.15  1.767767
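
To see how the 'W-MON' bins fall, here is a minimal self-contained sketch. It copies the Average values from the question and assumes a plain date index (the name `demo` is only for illustration). Each bin is closed and labelled on its right edge, so 2017/09/01 (a Friday) and 2017/09/04 (a Monday) land in the bin labelled 2017-09-04.

```python
import pandas as pd

# Sample values copied from the question; the plain date index is an assumption.
dates = pd.to_datetime(['2017/09/01', '2017/09/04', '2017/09/05', '2017/09/06'])
demo = pd.DataFrame({'Average': [7.1, 6.4, 6.3, 4.0]}, index=dates)

# 'W-MON' bins are closed and labelled on the right, so each bin ends on a Monday.
weekly = demo.resample('W-MON').mean()
print(weekly)
```

The two resulting rows should correspond to the Average column of the output shown above.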

Edit:

print (df.index)
#MultiIndex(levels=[['East', 'West'], ['Mid'], ['Equities'], 
#                   ['2017/09/01', '2017/09/04', '2017/09/05', '2017/09/06']],
#           labels=[[0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 2, 3]])

#set MultiIndex level names for later groupby
df.index.names = ('a','b','c','date')
#create DatetimeIndex
df = df.reset_index(level=[0,1,2])
df.index = pd.to_datetime(df.index)

#aggregate; Rank has to be aggregated by some method like mean or sum,
#because a rank of ranks makes no sense (sum matches the output below)
d = {'Average':'mean', 'Std':'std', 'Rank': 'sum'}
df = df.groupby(['a','b','c']).resample('W-MON').agg(d)
print (df)
                              Average       Std  Rank
a    b   c        date                               
East Mid Equities 2017-09-04     6.75  1.343503    19
                  2017-09-11     4.00       NaN    18
West Mid Equities 2017-09-11     6.30       NaN    16
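
If the goal is the Week-1, Week-2 labels from the question rather than aggregation, one possible sketch is below. It is not part of the original answer. It assumes that days 1-7 of a month count as week 1, days 8-14 as week 2, and so on, which is only one of several reasonable definitions. The level names ('region', 'cap', 'asset', 'date') and the variable names are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question, rebuilt with a proper MultiIndex.
# The level names ('region', 'cap', 'asset', 'date') are made up for this sketch.
idx = pd.MultiIndex.from_tuples(
    [('East', 'Mid', 'Equities', '2017/09/01'),
     ('East', 'Mid', 'Equities', '2017/09/04'),
     ('West', 'Mid', 'Equities', '2017/09/05'),
     ('East', 'Mid', 'Equities', '2017/09/06')],
    names=['region', 'cap', 'asset', 'date'])
data = pd.DataFrame({'Average': [7.1, 6.4, 6.3, 4.0],
                     'Std': [2.3, 4.2, 4.3, 1.8],
                     'Rank': [5, 14, 16, 18]}, index=idx)

# Assumption: days 1-7 are week 1, days 8-14 are week 2, and so on.
# Days 29-31 would fall into a fifth week under this rule.
days = pd.to_datetime(data.index.get_level_values('date')).day
week_labels = ['Week-{}'.format((d - 1) // 7 + 1) for d in days]

# Prepend the label as the outermost index level; the rows are not aggregated.
labelled = data.set_index(pd.Index(week_labels, name='week'), append=True)
labelled = labelled.reorder_levels(['week', 'region', 'cap', 'asset', 'date'])
print(labelled)
```

With this rule, all four sample dates fall into Week-1. A different week definition (for example, weeks starting on Monday) would only change how `week_labels` is computed.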