I have a "date" column and a "tmax" column. I want to create another column that shows the mean for each month of each year. I tried:
df['date']=pd.date_range(start='1/01/1980', end='31/12/2015')
#df['date'] = df['date'].dt.strftime('%d-%b-%Y') format of date
df['means'] = df.resample('M', on='date').mean()
I get:
ValueError: Wrong number of items passed 4, placement implies 1
Sample values are as follows:
date tmax
1-Jan-80 15.773
2-Jan-80 18.342
...
30-Jan-80 15.851
31-Jan-80 11.962
...
1-Dec-80 15.773
2-Dec-80 18.342
...
30-Dec-80 15.851
31-Dec-80 11.962
1-Jan-81 15.773
2-Jan-81 18.342
...
30-Jan-81 15.851
31-Jan-81 11.962
...
1-Dec-2015 15.773
2-Dec-2015 18.342
...
30-Dec-2015 15.851
31-Dec-2015 11.962
Answer 0 (score: 0)
Create a separate DataFrame for the mean, merge it into the original DataFrame, and then backfill the NaN values.
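import pandas as pd
# calculate & store the monthly mean in a new dataframe; the `how=` argument was
# removed from resample(), so call .mean() on the resampler instead
# (recent pandas releases spell the month-end alias 'ME' rather than 'M')
df2 = df.set_index('date')['tmax'].resample('M').mean().rename('means').reset_index()
# merge the 2 dataframes; each mean lands on the month-end row only
merged = pd.merge(df, df2, on='date', how='left')
# backfill so every day of a month carries that month's mean
merged['means'] = merged['means'].bfill()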
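As an alternative to the merge-and-backfill approach, the monthly mean can be broadcast onto every row directly with `groupby(...).transform('mean')`. The ValueError in the question most likely arises because `resample(...).mean()` returns a DataFrame with one row per month (and several columns), which cannot be assigned to a single column. The sketch below is only an illustration: the sample frame is made-up stand-in data, not the asker's values, and `month_key` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: one row per day for January and February 1980.
dates = pd.date_range(start="1980-01-01", end="1980-02-29", freq="D")
df = pd.DataFrame({"date": dates, "tmax": np.arange(len(dates), dtype=float)})

# Key each row by its calendar month within its year (e.g. 1980-01).
month_key = df["date"].dt.to_period("M")

# transform() returns a result aligned to the original rows, so every day
# in a month receives that month's mean and no merge or backfill is needed.
df["means"] = df.groupby(month_key)["tmax"].transform("mean")
```

Because `transform` keeps the original index, the result has the same length as `df` and can be assigned as a column without raising the error shown in the question.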