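For each day, I want the mean of the values between 8 AM and 5 PM. From those daily means, I then want a new mean over a longer period (for example a month, a year, or a custom range). How can I do this in pandas?
For example, the mean for August 2011 through November 2011, using only the readings from 8 AM to 5 PM each day.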
Time T_Sanyo_Gesloten
2010-08-31 12:30:00 33.910
2010-08-31 12:40:00 33.250
2010-08-31 12:50:00 30.500
2010-08-31 13:00:00 27.065
2010-08-31 13:10:00 25.610
...
2013-06-07 02:10:00 16.970
2013-06-07 02:20:00 16.955
2013-06-07 02:30:00 17.000
2013-06-07 02:40:00 17.015
2013-06-07 02:50:00 16.910
Answer 0 (score: 0)
import datetime as DT
import numpy as np
import pandas as pd
np.random.seed(2013)
N = 10**4
# Synthetic random-walk data sampled every 10 minutes
df = pd.DataFrame(
    np.cumsum(np.random.random(N) - 0.5),
    index=pd.date_range('2010-8-31', freq='10min', periods=N))
# 0
# 2010-08-31 00:00:00 0.175448
# 2010-08-31 00:10:00 0.631796
# 2010-08-31 00:20:00 0.399373
# 2010-08-31 00:30:00 0.499184
# 2010-08-31 00:40:00 0.631005
# ...
# 2010-11-08 09:50:00 -3.474801
# 2010-11-08 10:00:00 -3.172819
# 2010-11-08 10:10:00 -2.988451
# 2010-11-08 10:20:00 -3.101262
# 2010-11-08 10:30:00 -3.477685
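# indexer_between_time returns integer positions, so select with .iloc
# (.ix has been removed from pandas)
eight_to_five = df.iloc[df.index.indexer_between_time(DT.time(8), DT.time(17))]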
# 0
# 2010-08-31 08:00:00 1.440543
# 2010-08-31 08:10:00 1.450957
# 2010-08-31 08:20:00 1.746454
# 2010-08-31 08:30:00 1.443941
# 2010-08-31 08:40:00 1.845446
# ...
# 2010-11-08 09:50:00 -3.474801
# 2010-11-08 10:00:00 -3.172819
# 2010-11-08 10:10:00 -2.988451
# 2010-11-08 10:20:00 -3.101262
# 2010-11-08 10:30:00 -3.477685
# resample(..., how='mean') has been removed; call .mean() on the resampler instead
daily_mean = eight_to_five.resample('D').mean()
# 0
# 2010-08-31 0.754004
# 2010-09-01 0.203610
# 2010-09-02 5.219528
# 2010-09-03 6.337688
# 2010-09-04 2.765504
# Month-end bins; pandas >= 2.2 spells this alias 'ME'
monthly_mean = daily_mean.resample('M').mean()
# 0
# 2010-08-31 0.754004
# 2010-09-30 -0.437582
# 2010-10-31 3.533525
# 2010-11-30 4.356728
yearly_mean = daily_mean.groupby(daily_mean.index.year).mean()
# 0
# 2010 1.885995
To get a mean over a custom period, change the argument passed to groupby.
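As a rough sketch of the custom-period case from the question (August 2011 through November 2011), one option is to slice the daily means by date before averaging, and another is to pass a custom key to groupby. The synthetic series below, its name `s`, and the quarter-based grouping are assumptions made for illustration only; they stand in for the real `T_Sanyo_Gesloten` data.

```python
import numpy as np
import pandas as pd

# Synthetic 10-minute readings standing in for the real sensor data (assumption).
rng = np.random.default_rng(0)
idx = pd.date_range('2011-07-01', '2011-12-31 23:50', freq='10min')
s = pd.Series(rng.normal(20, 5, len(idx)), index=idx, name='T_Sanyo_Gesloten')

# Keep only readings between 08:00 and 17:00 (both ends included by default).
eight_to_five = s.between_time('08:00', '17:00')

# One mean per calendar day.
daily_mean = eight_to_five.resample('D').mean()

# Custom range: partial-string slicing on the DatetimeIndex,
# which includes all of August and all of November.
custom_mean = daily_mean.loc['2011-08':'2011-11'].mean()

# Custom grouping: any key aligned with the index works,
# here the (year, quarter) of each day.
quarterly_mean = daily_mean.groupby(
    [daily_mean.index.year, daily_mean.index.quarter]).mean()

print(custom_mean)
print(quarterly_mean)
```

Note that averaging the daily means gives every day equal weight, which matches the question; if some days have missing readings, this can differ slightly from averaging all 8 AM to 5 PM readings in the period directly.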