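Resetting pandas 'Series.rolling' at the start of each day

Date: 2016-06-25 20:05:33

Tags: python pandas

I am using pandas to analyze 1-minute OHLC market data, and I add a column containing a 20-period (20-minute) moving average to a DataFrame named "data" with the following: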

data['maFast'] = Series.rolling(data['Last'],center=False,window=20).mean() 

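My data has a daystart = '9' and a dayend = '16:14:59', and I would like the moving average to reset at daystart each day. I checked the Series.rolling documentation but did not see a reset option. How can I do this?

This shows the first day; the maFast column is populated after 20 periods, as expected: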

                      Open   High   Low    Last  Volume maFast  
Timestamp                           
2014-03-04 09:30:00 1783.50 1784.50 1783.50 1784.50 171 NaN 
2014-03-04 09:31:00 1784.75 1785.75 1784.50 1785.25 28  NaN 
2014-03-04 09:32:00 1785.00 1786.50 1785.00 1786.50 81  NaN 
2014-03-04 09:33:00 1786.00 1786.00 1785.25 1785.25 41  NaN 
2014-03-04 09:34:00 1785.00 1785.25 1784.75 1785.25 11  NaN 
2014-03-04 09:35:00 1785.50 1786.75 1785.50 1785.75 49  NaN 
2014-03-04 09:36:00 1786.00 1786.00 1785.25 1785.75 12  NaN 
2014-03-04 09:37:00 1786.00 1786.25 1785.25 1785.25 15  NaN 
2014-03-04 09:38:00 1785.50 1785.50 1784.75 1785.25 24  NaN 
2014-03-04 09:39:00 1785.50 1786.00 1785.25 1785.25 13  NaN 
2014-03-04 09:40:00 1786.00 1786.25 1783.50 1783.75 28  NaN 
2014-03-04 09:41:00 1784.00 1785.00 1784.00 1784.25 12  NaN 
2014-03-04 09:42:00 1784.25 1784.75 1784.00 1784.25 18  NaN 
2014-03-04 09:43:00 1784.75 1785.00 1784.50 1784.50 10  NaN 
2014-03-04 09:44:00 1784.25 1784.25 1783.75 1784.00 32  NaN 
2014-03-04 09:45:00 1784.50 1784.75 1784.50 1784.75 11  NaN 
2014-03-04 09:46:00 1785.00 1785.00 1784.50 1784.50 11  NaN 
2014-03-04 09:47:00 1785.00 1785.75 1784.75 1785.75 20  NaN 
2014-03-04 09:48:00 1785.75 1786.00 1785.75 1786.00 17  NaN 
2014-03-04 09:49:00 1786.00 1786.50 1785.75 1786.00 13  1785.0875   
2014-03-04 09:50:00 1786.50 1788.75 1786.25 1788.50 307 1785.2875   
2014-03-04 09:51:00 1788.25 1788.25 1787.75 1787.75 17  1785.4125   
2014-03-04 09:52:00 1787.75 1787.75 1787.25 1787.25 11  1785.4500   
2014-03-04 09:53:00 1787.25 1787.50 1787.25 1787.25 11  1785.5500   
2014-03-04 09:54:00 1787.00 1787.50 1786.75 1786.75 26  1785.6250   
2014-03-04 09:55:00 1787.25 1788.25 1787.25 1788.00 11  1785.7375   

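The second day already has maFast values at 09:30, because the window reaches back into the previous day, but I need the average to reset each day.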

                    Open    High    Low Last    Volume  maFast  
Timestamp                           
2014-03-05 09:30:00 1793.25 1794.00 1793.25 1793.25 3   1792.5125   
2014-03-05 09:31:00 1793.50 1793.50 1791.75 1792.25 25  1792.4625   
2014-03-05 09:32:00 1791.50 1791.75 1791.25 1791.75 55  1792.3625

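1 answer:

Answer 0 (score: 4)

Here is an example that uses a 1-hour frequency to keep the output short, but it shows the main idea: group by day and apply the rolling function to each group of the DataFrame.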

In [62]: df = pd.DataFrame(index=pd.date_range(start='2014-03-04 09:00:00', end='2014-03-04 16:15:00', freq='1h').union(pd.date_range(start='2014-03-05 09:00:00', end='2014-03-05 16:15:00', freq='1h')), data={'x': 1})


In [63]: df
Out[63]: 
                     x
2014-03-04 09:00:00  1
2014-03-04 10:00:00  1
2014-03-04 11:00:00  1
2014-03-04 12:00:00  1
2014-03-04 13:00:00  1
2014-03-04 14:00:00  1
2014-03-04 15:00:00  1
2014-03-04 16:00:00  1
2014-03-05 09:00:00  1
2014-03-05 10:00:00  1
2014-03-05 11:00:00  1
2014-03-05 12:00:00  1
2014-03-05 13:00:00  1
2014-03-05 14:00:00  1
2014-03-05 15:00:00  1
2014-03-05 16:00:00  1

In [64]: df.groupby(pd.TimeGrouper('d')).apply(pd.rolling_sum, 3)
Out[64]: 
                      x
2014-03-04 09:00:00 NaN
2014-03-04 10:00:00 NaN
2014-03-04 11:00:00   3
2014-03-04 12:00:00   3
2014-03-04 13:00:00   3
2014-03-04 14:00:00   3
2014-03-04 15:00:00   3
2014-03-04 16:00:00   3
2014-03-05 09:00:00 NaN
2014-03-05 10:00:00 NaN
2014-03-05 11:00:00   3
2014-03-05 12:00:00   3
2014-03-05 13:00:00   3
2014-03-05 14:00:00   3
2014-03-05 15:00:00   3
2014-03-05 16:00:00   3
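
In recent pandas releases pd.TimeGrouper and pd.rolling_sum are no longer available, so the call above may need adapting. Below is a minimal sketch of the same group-by-day idea applied to 1-minute data like the question's. The frame here is a made-up stand-in for "data" (the 'Last' values are synthetic), and grouping on the normalized index is just one way to key by calendar day; it is not taken from the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `data` frame: two sessions of
# 1-minute bars with a synthetic 'Last' column.
idx = pd.date_range("2014-03-04 09:30", "2014-03-04 16:14", freq="1min").union(
    pd.date_range("2014-03-05 09:30", "2014-03-05 16:14", freq="1min")
)
data = pd.DataFrame({"Last": np.linspace(1784.0, 1794.0, len(idx))}, index=idx)

# Plain rolling ignores day boundaries: the window at the second day's
# open still contains the last 19 bars of the first day.
data["maPlain"] = data["Last"].rolling(window=20).mean()

# Group by calendar day (timestamps normalized to midnight), then roll
# inside each group so every day starts with an empty window.
data["maFast"] = (
    data.groupby(data.index.normalize())["Last"]
    .transform(lambda s: s.rolling(window=20).mean())
)

# First three bars of the second day: maPlain is filled, maFast is NaN.
print(data.loc["2014-03-05"].head(3))
```

With this approach the first 19 rows of every day are NaN. If a partial average from the first bar is preferred, passing min_periods=1 to rolling is one option.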