Assigning a datetime after groupby in pandas

Time: 2016-01-28 05:28:38

Tags: python pandas

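I am computing a climatology, i.e. averaging by calendar day over a dataframe that holds several years of daily data and has a datetime index, like this: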

df.groupby([df.index.month, df.index.day]).mean()

Once I do the groupby, the datetime index is gone. That makes sense, because each row after the groupby no longer corresponds to a single unique datetime.

Is there a way to reintroduce a datetime after the groupby is done, by artificially assigning a year?

- EDIT: the dataframe:

datetime    val1    val2
1/1/2000    74.25769    5.813470958
1/2/2000    74.25769    5.813470958
1/3/2000    74.25769    5.813470958
1/4/2000    74.25769    5.813470958
1/5/2000    76.67728083 5.813470958
1/6/2000    76.67728083 5.813470958
1/7/2000    76.67728083 5.813470958
1/4/2001    76.67728083 5.813470958
1/5/2001    77.30620917 12.3357252
1/6/2001    77.30620917 12.3357252
1/7/2001    77.30620917 12.3357252
1/8/2001    77.30620917 12.3357252
1/9/2001    77.30620917 12.3357252
1/10/2001   77.30620917 12.3357252

1 answer:

Answer 0: (score: 2)

IIUC you lose the year information, but after the groupby you can use map with a custom year plus the months and days from the grouped index to set a new index:

import datetime

df = df.groupby([df.index.month, df.index.day]).mean()
print(df)
           val1       val2
1 1   74.257690   5.813471
  2   74.257690   5.813471
  3   74.257690   5.813471
  4   75.467485   5.813471
  5   76.991745   9.074598
  6   76.991745   9.074598
  7   76.991745   9.074598
  8   77.306209  12.335725
  9   77.306209  12.335725
  10  77.306209  12.335725

df['Date'] = df.index.map(lambda x: datetime.date(2000, x[0], x[1]))
print(df.set_index('Date'))
                 val1       val2
Date                            
2000-01-01  74.257690   5.813471
2000-01-02  74.257690   5.813471
2000-01-03  74.257690   5.813471
2000-01-04  75.467485   5.813471
2000-01-05  76.991745   9.074598
2000-01-06  76.991745   9.074598
2000-01-07  76.991745   9.074598
2000-01-08  77.306209  12.335725
2000-01-09  77.306209  12.335725
2000-01-10  77.306209  12.335725
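
As far as I can tell, mapping to datetime.date objects gives an object-dtype index rather than a true DatetimeIndex, so time-based features such as resample may not be available on the result. A minimal sketch of a variant that builds pd.Timestamp values instead is below. The name clim, the four sample rows (a subset of the question's data), and the placeholder year 2000 are assumptions chosen for illustration. A leap year is used as the placeholder so that a (2, 29) group, if present, still maps to a valid date.

```python
import pandas as pd

# Subset of the question's rows, two years sharing the same calendar days
df = pd.DataFrame(
    {
        "val1": [74.25769, 76.67728083, 76.67728083, 77.30620917],
        "val2": [5.813470958, 5.813470958, 5.813470958, 12.3357252],
    },
    index=pd.to_datetime(
        ["2000-01-04", "2000-01-05", "2001-01-04", "2001-01-05"]
    ),
)

# Average each calendar day across years; the result has a (month, day) MultiIndex
clim = df.groupby([df.index.month, df.index.day]).mean()

# Rebuild a real DatetimeIndex using an artificial year (2000 is a leap year)
clim.index = pd.DatetimeIndex(
    [pd.Timestamp(year=2000, month=int(m), day=int(d)) for m, d in clim.index],
    name="Date",
)

print(clim)
```

With a DatetimeIndex in place, label-based date slicing such as clim.loc["2000-01"] should work on the climatology.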