This is what the DataFrame looks like:
Date Time (HHMM) Site Plot Replicate Temperature \
0 2002-05-01 600 Barre Woods 16 5 4.5
1 2002-05-01 600 Barre Woods 21 7 4.5
2 2002-05-01 600 Barre Woods 31 9 6.5
3 2002-05-01 600 Barre Woods 10 2 5.3
4 2002-05-01 600 Barre Woods 2 1 4.0
5 2002-05-01 600 Barre Woods 13 4 5.5
6 2002-05-01 600 Barre Woods 11 3 5.0
7 2002-05-01 600 Barre Woods 28 8 5.0
8 2002-05-01 600 Barre Woods 18 6 4.5
9 2002-05-01 1400 Barre Woods 2 1 10.3
10 2002-05-01 1400 Barre Woods 31 9 9.0
11 2002-05-01 1400 Barre Woods 13 4 11.0
import pandas as pd
import datetime as dt
from datetime import datetime
df = pd.read_csv('F:/data32.csv', parse_dates=['Date'])
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y')
This is where I get the error:
df2=df.groupby(pd.TimeGrouper(freq='M'))
The error is:
Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
Answer 0: (score: 1)
You can use set_index first, so that the Date column becomes a DatetimeIndex:
dfx = df.set_index('Date')
Then you can group by the month of each index value:
dfx.groupby(lambda x: x.month).mean(numeric_only=True)  # .mean() is just an example; numeric_only=True skips text columns such as Site
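If you want to stay close to the original TimeGrouper call, pd.Grouper is its replacement (pd.TimeGrouper was deprecated and is no longer available in recent pandas releases). The following is only a rough sketch on a small made-up frame, not the asker's file: the values are illustrative, and the 'MS' (month start) frequency is my choice so that each group is labelled by the first day of its month.

```python
import pandas as pd

# Small made-up frame standing in for the question's data (values are illustrative).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2002-05-01", "2002-05-01", "2002-06-03", "2003-05-02"]),
    "Temperature": [4.5, 6.5, 10.0, 8.0],
})

# Option 1: move 'Date' into the index, then group on that DatetimeIndex.
by_index = df.set_index("Date").groupby(pd.Grouper(freq="MS"))["Temperature"].mean()

# Option 2: keep the default RangeIndex and point the Grouper at the column instead.
by_key = df.groupby(pd.Grouper(key="Date", freq="MS"))["Temperature"].mean()

# A frequency-based Grouper also creates bins for months without rows (mean is NaN),
# so drop those if only observed months are wanted.
observed = by_key.dropna()
print(observed)
```

Either option avoids the RangeIndex error, because the Grouper is given a datetime axis to work on. Unlike grouping by x.month, this keeps May 2002 and May 2003 in separate groups.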
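Answer 1: (score: 1)
Group by df['Date'].dt.month. This works on the column directly, so the index does not need to change. For example, to compute the mean temperature, you can do the following.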
import io
import pandas as pd
data = io.StringIO('''\
Date,Time (HHMM),Site,Plot,Replicate,Temperature
0,2002-05-01,600,Barre Woods,16,5,4.5
1,2002-05-01,600,Barre Woods,21,7,4.5
2,2002-05-01,600,Barre Woods,31,9,6.5
3,2002-05-01,600,Barre Woods,10,2,5.3
4,2002-05-01,600,Barre Woods,2,1,4.0
5,2002-05-01,600,Barre Woods,13,4,5.5
6,2002-05-01,600,Barre Woods,11,3,5.0
7,2002-05-01,600,Barre Woods,28,8,5.0
8,2002-05-01,600,Barre Woods,18,6,4.5
9,2002-05-01,1400,Barre Woods,2,1,10.3
10,2002-05-01,1400,Barre Woods,31,9,9.0
11,2002-05-01,1400,Barre Woods,13,4,11.0
''')
df = pd.read_csv(data)
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df.groupby(df['Date'].dt.month)['Temperature'].mean()
Output:
Date
5 6.258333
Name: Temperature, dtype: float64
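One thing to keep in mind: .dt.month is only the month number, so if the data spans several years, all Mays are averaged together. If that is not what you want, grouping by a year-month period is one way around it. A minimal sketch, assuming a made-up frame with illustrative values (not the asker's data):

```python
import pandas as pd

# Illustrative data: two readings in May 2002 and one in May 2003.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2002-05-01", "2002-05-15", "2003-05-02"]),
    "Temperature": [4.0, 6.0, 9.0],
})

# Month number only: May 2002 and May 2003 fall into the same group (key 5).
by_month_number = df.groupby(df["Date"].dt.month)["Temperature"].mean()

# Year-month period: each calendar month of each year is its own group.
by_year_month = df.groupby(df["Date"].dt.to_period("M"))["Temperature"].mean()

print(by_month_number)
print(by_year_month)
```

For data from a single year, as in the question's sample, both groupings give the same numbers and differ only in the index labels.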