I have a series; here is an excerpt:
Dates
1988-01-01 NaN
1988-01-04 257.40
1988-01-05 259.80
1988-01-06 258.60
1988-01-07 262.85
1988-01-08 240.75
1988-01-11 247.70
1988-01-12 246.35
1988-01-13 246.25
1988-01-14 247.45
1988-01-15 251.50
...
2019-03-01 2805.00
2019-03-04 2791.50
2019-03-05 2791.50
2019-03-06 2771.50
2019-03-07 2750.00
2019-03-08 2747.00
2019-03-11 2789.00
2019-03-12 2797.25
2019-03-13 2819.50
2019-03-14 2812.25
2019-03-15 2829.75
Length: 8141, dtype: float64
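I need to compute a 40-week moving average of this series by weekday (i.e., over Mondays only, over Tuesdays only, and so on).
I tried several approaches, but only one of them worked: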
werTarget = werTarget.fillna(method='ffill')
i = 0
while i < 5:  # for Monday to Friday, do each weekday separately
    tmpTarget = werTarget[werTarget.index.weekday == i]
    tmpIntmdInd = tmpTarget / tmpTarget.rolling(window=40).mean()
    if i == 0:
        IntmdInd = tmpIntmdInd
    else:
        holdindx = IntmdInd
    i = i + 1
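It took more than two hours to finish, and when I plot the result, every data point is drawn as its own line.
What I need is a single series, and it has to be much faster: some of the series are longer than this one, and there are actually thousands of them.
I tried something more concise: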
werTarget = werTarget.fillna(method='ffill')
IntmdInd = werTarget.groupby('weekday').rolling(window=40).mean()
but this results in an error:
Traceback (most recent call last):
  File "<ipython-input-16-1d4ba482ec32>", line 1, in <module>
    runfile('C:/MyFile.py', wdir='C:/MyDir')
  File "C:\Users\Admin\Anaconda2\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
  File "C:\Users\Admin\Anaconda2\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/MyFile.py", line 62, in <module>
    werGraph(sp,werOne)
  File "C:/MyFile.py", line 44, in werGraph
    IntmdInd = werIntmdInd(werRat)
  File "C:/MyFile.py", line 34, in werIntmdInd
    IntmdInd = werTarget.groupby('weekday').rolling(window=75).mean()
  File "C:\Users\Admin\Anaconda2\lib\site-packages\pandas\core\generic.py", line 7632, in groupby
    observed=observed, **kwargs)
  File "C:\Users\Admin\Anaconda2\lib\site-packages\pandas\core\groupby\groupby.py", line 2110, in groupby
    return klass(obj, by, **kwds)
  File "C:\Users\Admin\Anaconda2\lib\site-packages\pandas\core\groupby\groupby.py", line 360, in __init__
    mutated=self.mutated)
  File "C:\Users\Admin\Anaconda2\lib\site-packages\pandas\core\groupby\grouper.py", line 578, in _get_grouper
    raise KeyError(gpr)
KeyError: 'weekday'
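Does anyone know a solution?
Answer 0 (score: 0)
I'm not sure where the error comes from, because I used almost exactly the code from the question. I'll demonstrate with some data from pandas_datareader.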
Then I convert the index to datetime, get the weekday, and compute the rolling mean on the grouped data.
>>> import pandas_datareader as pdr
>>> import pandas as pd # version 0.24.2
>>>
>>> start = pd.to_datetime('2017-01-01')#datetime(2015, 2, 9)
>>> end = pd.to_datetime('2019-01-01')
>>> f = pdr.data.DataReader('F', 'iex', start, end)
>>> f.head()
               open     high      low    close    volume
date
2017-01-03  10.4286  10.7705  10.3687  10.7619  40510821
2017-01-04  10.9158  11.3432  10.8902  11.2577  77638075
2017-01-05  11.2919  11.3005  10.7961  10.9158  75628443
2017-01-06  10.9414  10.9756  10.8047  10.9072  40315887
2017-01-09  10.9329  10.9927  10.7961  10.7961  39438393
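The steps described above (datetime index, weekday, rolling mean per weekday group) can be sketched roughly as follows. This is only a minimal sketch and not necessarily the exact code behind this answer: it uses synthetic data instead of the IEX download, and the names `prices`, `weekday`, `rolling_mean` and `ratio` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic business-day price series standing in for the real data (assumption).
idx = pd.bdate_range("2017-01-02", periods=500)
rng = np.random.default_rng(0)
prices = pd.Series(100 + rng.standard_normal(len(idx)).cumsum(),
                   index=idx, name="close")

# Make sure the index is a DatetimeIndex, then forward-fill any gaps.
prices.index = pd.to_datetime(prices.index)
prices = prices.ffill()

# Weekday of every observation (0 = Monday ... 4 = Friday), taken from the index.
# Grouping by these values, rather than by the string 'weekday', avoids the
# KeyError: a string key is looked up as a column or index level name.
weekday = prices.index.weekday

window = 40  # 40 observations of the same weekday, i.e. roughly 40 weeks

# transform() returns a result aligned to the original index, so all five
# weekdays end up in one series in chronological order.
rolling_mean = prices.groupby(weekday).transform(
    lambda s: s.rolling(window=window).mean()
)

# Ratio of each price to its own weekday's rolling mean, as in the question.
ratio = prices / rolling_mean
```

Because `transform` keeps the original index, `ratio` is a single chronologically ordered series that can be plotted directly, and the Python-level loop over weekdays is replaced by one grouped operation.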