I have a regularly spaced time series in a pandas DataFrame:
1998-01-01 00:00:00 5.71
1998-01-01 12:00:00 5.73
1998-01-02 00:00:00 5.68
1998-01-02 12:00:00 5.69 ...
I also have a list of irregularly spaced dates:
1998-01-01
1998-07-05
1998-09-21 ....
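I want to compute the mean of the time series over each interval between consecutive dates in the list. Is this possible somehow with pandas.DataFrame.resample? If not, what is the simplest way to do it?
Edit: For example, compute the mean of 'series' between each pair of consecutive dates in 'dates', both created by the following code: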
import pandas as pd
import numpy as np
import datetime
rng = pd.date_range('1998-01-01', periods=365, freq='D')
series = pd.DataFrame(np.random.randn(len(rng)), index=rng)
dates = [pd.Timestamp('1998-01-01'), pd.Timestamp('1998-07-05'), pd.Timestamp('1998-09-21')]
Answer 0: (score: 0)
You can iterate over the dates like this:
for ti in range(1, len(dates)):
    start_date, end_date = dates[ti - 1], dates[ti]
    # Half-open interval (start_date, end_date]: each row is counted once
    mask = (series.index > start_date) & (series.index <= end_date)
    print(series[mask].mean())
Answer 1: (score: 0)
You can loop over the dates and select only the rows that fall between each pair of consecutive dates:
import pandas as pd
import numpy as np
import datetime
rng = pd.date_range('1998-01-01', periods=365, freq='D')
series = pd.DataFrame(np.random.randn(len(rng)), index=rng)
dates = [pd.Timestamp('1998-01-01'), pd.Timestamp('1998-07-05'), pd.Timestamp('1998-09-21')]
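for i in range(len(dates) - 1):
    start = dates[i]
    end = dates[i + 1]
    # Keep rows with start < timestamp <= end
    sample = series.loc[(series.index > start) & (series.index <= end)]
    # sample.mean() is a Series indexed by column label; 0 is the only column
    print(f'Mean value between {start} and {end} : {sample.mean()[0]}')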
# Output (values vary from run to run because the data is random and unseeded)
Mean value between 1998-01-01 00:00:00 and 1998-07-05 00:00:00 : -0.024342221543215112
Mean value between 1998-07-05 00:00:00 and 1998-09-21 00:00:00 : 0.13945008064765074
Instead of a loop, you can also use a list comprehension like this:
[series.loc[(series.index > dates[i]) & (series.index <= dates[i+1])].mean()[0] for i in range(len(dates) - 1) ] # [-0.024342221543215112, 0.13945008064765074]
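If the series is long or there are many dates, the same means can probably be computed without a Python-level loop. As far as I know, resample is built around a fixed frequency and does not accept arbitrary bin edges, so the sketch below instead labels each row with its interval via DatetimeIndex.searchsorted and then groups on that label. The names edges, pos, inside, and interval_means are illustrative, and the fixed seed is an assumption added only to make the run repeatable.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('1998-01-01', periods=365, freq='D')
# Seeded generator (an assumption for reproducibility; the question uses np.random.randn)
series = pd.DataFrame(np.random.default_rng(0).standard_normal(len(rng)), index=rng)
dates = [pd.Timestamp('1998-01-01'), pd.Timestamp('1998-07-05'), pd.Timestamp('1998-09-21')]

# Sorted interval edges; searchsorted requires them in ascending order
edges = pd.DatetimeIndex(dates)

# For each timestamp t, side='left' returns j with edges[j-1] < t <= edges[j]
pos = edges.searchsorted(series.index, side='left')

# Keep only rows that fall inside some (edges[j-1], edges[j]] interval
inside = (pos >= 1) & (pos <= len(edges) - 1)

# Group the single column (label 0) by interval number 0, 1, ...
interval_means = series[0][inside].groupby(pos[inside] - 1).mean()
print(interval_means)
```

The result is a Series whose index is the interval number, so interval_means[0] corresponds to the span from dates[0] to dates[1]. The intervals are open on the left and closed on the right, matching the masks used in the loops above.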