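Please excuse the poor style and inefficient solutions. Any help is much appreciated.
Background:
I am trying to isolate the best cycling performance gains over periods of 6 weeks or more within one year. Performance here means the maximum effort a cyclist records over any given duration, e.g. 1 minute, 5 minutes, 20 minutes, and so on.
Task:
Data: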
ap1 = np.array([[datetime(2015, 10, 17, 12, 45, 13),
datetime(2015, 10, 18, 11, 56, 35),
datetime(2015, 10, 20, 9, 24, 52),
datetime(2015, 10, 23, 9, 27, 12),
datetime(2015, 10, 24, 12, 26, 33)],
[281.0, 343.0, 270.0, 312.0, 320.0],
[246.0, 305.0, 260.0, 283.0, 289.0],
[236.0, 250.0, 239.0, 257.0, 245.0]], dtype=object)
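Problem: I am currently stuck on task 1. I have been trying to follow user2689410's answer on computing a rolling mean over an irregular time series, and I am hoping to borrow his data-slicing approach.
I just want to slice the dataset into rolling 45-day intervals. Here is my progress so far: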
from pandas import Series, DataFrame
import pandas as pd
from datetime import datetime, timedelta
import numpy as np
idx = ap1[0]
idx = pd.Index(idx)
ap1=np.transpose(ap1)
ap1=pd.DataFrame(ap1, index = idx, columns = ['date', 'cp1', 'cp2', 'cp3'])
ap2=ap1.drop('date', 1)
ap2 = DataFrame(ap2.copy())
idx = Series(ap2.index.to_pydatetime(), index=ap2.index)
for colname, col in ap2.iteritems():
    dslice = col[idx-pd.tseries.frequencies.to_offset('42D').delta:idx]
The for loop raises this error:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/usr/local/lib64/python2.7/site-packages/pandas/core/series.py", line 642, in __getitem__
return self._get_with(key)
File "/usr/local/lib64/python2.7/site-packages/pandas/core/series.py", line 647, in _get_with
indexer = self.index._convert_slice_indexer(key, kind='getitem')
File "/usr/local/lib64/python2.7/site-packages/pandas/indexes/base.py", line 1208, in _convert_slice_indexer
indexer = self.slice_indexer(start, stop, step, kind=kind)
File "/usr/local/lib64/python2.7/site-packages/pandas/tseries/index.py", line 1497, in slice_indexer
return Index.slice_indexer(self, start, end, step, kind=kind)
File "/usr/local/lib64/python2.7/site-packages/pandas/indexes/base.py", line 2962, in slice_indexer
kind=kind)
File "/usr/local/lib64/python2.7/site-packages/pandas/indexes/base.py", line 3141, in slice_locs
start_slice = self.get_slice_bound(start, 'left', kind)
File "/usr/local/lib64/python2.7/site-packages/pandas/indexes/base.py", line 3084, in get_slice_bound
slc = self.get_loc(label)
File "/usr/local/lib64/python2.7/site-packages/pandas/tseries/index.py", line 1419, in get_loc
stamp = Timestamp(key, tz=self.tz)
File "pandas/tslib.pyx", line 405, in pandas.tslib.Timestamp.__new__ (pandas/tslib.c:9932)
File "pandas/tslib.pyx", line 1475, in pandas.tslib.convert_to_tsobject (pandas/tslib.c:26432)
TypeError: Cannot convert input to Timestamp
Where do I go from here?
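Answer 0 (score: 0)
Nowadays, pandas.DataFrame.rolling can handle irregular time series: with a DatetimeIndex, the window can be given as a time offset such as '42D'.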
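As a minimal sketch of that suggestion, the snippet below rebuilds the question's sample data as a DataFrame with a DatetimeIndex and applies a time-based rolling window. The names `df`, `rolling_max`, `rolling_mean` and `rolling_max_3d` are illustrative, and using `max` and `mean` as the aggregations is an assumption about what the asker wants per window. Since the sample spans only one week, a '3D' window is included to make the effect visible.

```python
from datetime import datetime
import pandas as pd

dates = [datetime(2015, 10, 17, 12, 45, 13),
         datetime(2015, 10, 18, 11, 56, 35),
         datetime(2015, 10, 20, 9, 24, 52),
         datetime(2015, 10, 23, 9, 27, 12),
         datetime(2015, 10, 24, 12, 26, 33)]
df = pd.DataFrame({'cp1': [281.0, 343.0, 270.0, 312.0, 320.0],
                   'cp2': [246.0, 305.0, 260.0, 283.0, 289.0],
                   'cp3': [236.0, 250.0, 239.0, 257.0, 245.0]},
                  index=pd.DatetimeIndex(dates))

# Trailing 42-day window ending at each ride (the row's own timestamp is included).
rolling_max = df.rolling('42D').max()
rolling_mean = df.rolling('42D').mean()

# A shorter 3-day window, so that older rides actually drop out of the window.
rolling_max_3d = df.rolling('3D').max()
print(rolling_max_3d['cp1'].tolist())
```

Note that these windows look backwards from each timestamp, whereas the loop in the other answer slices forwards from each start date.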
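Answer 1 (score: -1)
I found a solution; it is not pretty, but it works. Please give feedback to make it more efficient. For each column, the solution gives me an array of sub-arrays, one per moving window: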
# ap2 starts out as the raw 4-row object array from the question (dates, cp1, cp2, cp3)
idx = pd.Index(ap2[0])  # row 0 holds the dates
ap2 = pd.DataFrame(np.transpose(ap2), index=idx, columns=['date', 'cp1', 'cp2', 'cp3'])
ap2 = ap2.drop(columns='date').astype(float)
window = pd.Timedelta('42D')
# Window start dates: every ride at least one full window before the last ride
idxwindow = ap2.index[ap2.index <= ap2.index[-1] - window]
# One list of sub-arrays per column, instead of creating variables with exec
subarrays = {colname: [] for colname in ap2.columns}
for colname, col in ap2.items():
    for i in idxwindow:
        result = col.loc[i:i + window]
        # Pair each date with its value: shape (rides_in_window, 2)
        result = np.stack((result.index.date, result.values), axis=-1)
        subarrays[colname].append(result)
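The windowing loop above can also be condensed into a small helper. This is only a sketch: `window_slices` is a hypothetical name, not a pandas function, and it assumes the Series has a sorted DatetimeIndex. It is run on the question's `cp1` sample with a 3-day window, because the sample covers just one week and a 42-day window would yield no complete windows.

```python
from datetime import datetime
import pandas as pd

def window_slices(col, window):
    """Return one sub-Series per start date, covering [start, start + window]."""
    window = pd.Timedelta(window)
    # Only start where a full window still fits before the last observation.
    last_start = col.index[-1] - window
    starts = col.index[col.index <= last_start]
    # Label-based slicing with scalar bounds; both ends are inclusive.
    return [col.loc[start:start + window] for start in starts]

dates = pd.DatetimeIndex([datetime(2015, 10, 17, 12, 45, 13),
                          datetime(2015, 10, 18, 11, 56, 35),
                          datetime(2015, 10, 20, 9, 24, 52),
                          datetime(2015, 10, 23, 9, 27, 12),
                          datetime(2015, 10, 24, 12, 26, 33)])
cp1 = pd.Series([281.0, 343.0, 270.0, 312.0, 320.0], index=dates)

slices = window_slices(cp1, '3D')
for s in slices:
    print(s.index[0], s.tolist())
```

The slice bounds here are single timestamps. The question's loop passed a whole Series (`idx`) as both bounds, which is what a label slice cannot convert to a Timestamp.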