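I have an 8784 x 13 DataFrame (df2) that looks like the one below, with a "DATE" column in yyyy-mm-dd format and a "TIME" column holding the hour. I need to compute the daily and monthly averages for 2016: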
DATE TIME BAFFIN BAY GATUN II GATUN I KLONDIKE IIIG \
8759 2016-01-01 0000 8.112838 3.949518 3.291540 7.629178
8760 2016-01-01 0100 7.977169 4.028678 3.097562 7.477159
KLONDIKE II LAGOA II LAGOA I PENASCAL II PENASCAL I SABINA \
8759 7.095450 NaN NaN 8.250527 8.911508 3.835205
8760 7.362562 NaN NaN 7.877099 7.858908 3.766714
SIERRA QUEMADA
8759 3.405049
8760 4.386598
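I tried converting the 'DATE' column to datetime so I could use groupby, but I'm not sure how to do that. I tried the following, but when I checked the results against my calculations in Excel, the data had not been grouped into daily or monthly averages: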
davg_df2 = df2.groupby(by=df2['DATE'].dt.date).mean() #
davg_df2m = df2.groupby(by=df2['DATE'].dt.month).mean() #
Thanks! I'm still learning Python and trying to understand how to work with dates and different data types.
Answer 0 (score: 2)
Try this:
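df2['DATE'] = pd.to_datetime(df2['DATE'], format='%Y-%m-%d')
# monthly means (newer pandas versions prefer freq='ME' for month end)
davg_df2m = df2.groupby(pd.Grouper(freq='M', key='DATE')).mean(numeric_only=True)
# daily means (stored under a different name so the monthly result is not overwritten)
davg_df2 = df2.groupby(pd.Grouper(freq='D', key='DATE')).mean(numeric_only=True)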
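To sanity-check the grouping without the original data, here is a minimal, self-contained sketch on synthetic hourly data for 2016. The column names SITE_A and SITE_B and their values are made up for illustration, and it groups on the normalized date and on a monthly period rather than on pd.Grouper, which is just one alternative way to get the same buckets.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df2: one row per hour of 2016 (a leap year, so 8784 rows).
hours = pd.date_range("2016-01-01", periods=8784, freq="h")
df = pd.DataFrame({
    "DATE": hours.strftime("%Y-%m-%d"),   # strings, as in the question
    "TIME": hours.strftime("%H%M"),
    "SITE_A": np.arange(8784, dtype=float),              # hypothetical values
    "SITE_B": np.where(hours.month == 1, np.nan, 1.0),   # missing all of January
})

# Convert the string column to real datetimes before grouping.
df["DATE"] = pd.to_datetime(df["DATE"], format="%Y-%m-%d")

# Average only the measurement columns, leaving TIME out of the result.
value_cols = ["SITE_A", "SITE_B"]
daily = df.groupby(df["DATE"].dt.normalize())[value_cols].mean()
monthly = df.groupby(df["DATE"].dt.to_period("M"))[value_cols].mean()

print(daily.shape, monthly.shape)
```

With a full leap year of hourly rows, the daily result should have 366 rows and the monthly result 12. Row counts like these are a quick first check before comparing individual values against Excel.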
Answer 1 (score: 0)
# first convert the DATE column to datetime data type:
df2['DATE'] = pd.to_datetime(df2['DATE'])
# create new columns for month and day:
df2['month'] = df2['DATE'].dt.month
df2['day'] = df2['DATE'].dt.day
# monthly means:
davg_df2m = df2.groupby('month').mean(numeric_only=True)
# daily means: group by month and day together, since grouping by 'day' alone
# would pool the same day-of-month across all twelve months:
davg_df2 = df2.groupby(['month', 'day']).mean(numeric_only=True)
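A small sketch of why the grouping key matters for the daily case: grouping on day-of-month alone gives at most 31 groups, while grouping on month and day together gives one group per calendar day. The data here is synthetic (one row per day of 2016 with a made-up VALUE column), not the original df2.

```python
import pandas as pd

# One row per calendar day of 2016; VALUE is a hypothetical measurement.
dates = pd.Series(pd.date_range("2016-01-01", "2016-12-31", freq="D"))
df = pd.DataFrame({"DATE": dates, "VALUE": range(len(dates))})

# Day-of-month only: every 1st, every 2nd, ... are pooled across months.
by_day_of_month = df.groupby(df["DATE"].dt.day)["VALUE"].mean()

# Month and day together: one group per calendar day.
month = df["DATE"].dt.month.rename("month")
day = df["DATE"].dt.day.rename("day")
by_calendar_day = df.groupby([month, day])["VALUE"].mean()

print(len(by_day_of_month), len(by_calendar_day))
```

If the data spanned more than one year, the year would need to be part of the key as well, or the grouping could be done on the date itself.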