To implement the __getitem__() method of PyTorch's Dataset class, the dataset needs to support indexing, so that dataset[i] can be used to get the ith sample.
Say I have a time series ser:
2017-12-29 14:44:00 69.90
2017-12-29 14:45:00 69.91
2017-12-29 14:46:00 69.87
2017-12-29 14:47:00 69.85
2017-12-29 14:48:00 69.86
2017-12-29 14:49:00 69.92
2017-12-29 14:50:00 69.90
2017-12-29 14:51:00 70.00
2017-12-29 14:52:00 69.97
2017-12-29 14:53:00 69.99
2017-12-29 14:54:00 69.99
2017-12-29 14:55:00 69.85
Since I need to index into rolling windows, I use the following to generate the rolling windows of length 3 over the time series:
l3_list = list()
def t(x):
    l3_list.append(x.copy())  # with raw=True, x is a NumPy array
    return 0.0  # rolling.apply() expects a numeric return value
ser.rolling(3).apply(t, raw=True)
l3_list becomes:
[array([69.9 , 69.91, 69.87]),
 array([69.91, 69.87, 69.85]),
 array([69.87, 69.85, 69.86]),
 array([69.85, 69.86, 69.92]),
 array([69.86, 69.92, 69.9 ]),
 array([69.92, 69.9 , 70. ]),
 array([69.9 , 70. , 69.97]),
 array([70. , 69.97, 69.99]),
 array([69.97, 69.99, 69.99]),
 array([69.99, 69.99, 69.85])]
This way I can index into l3_list, i.e. l3_list[i] is the ith sliding window. Is there a more memory-efficient way to do this?
Answer 0 (score: 0)
You can add a column, somewhat like what is done here: Pandas rolling window to return an array
from io import StringIO
import numpy as np
import pandas as pd
data = """
2017-12-29 14:44:00 69.90
2017-12-29 14:45:00 69.91
2017-12-29 14:46:00 69.87
2017-12-29 14:47:00 69.85
2017-12-29 14:48:00 69.86
2017-12-29 14:49:00 69.92
2017-12-29 14:50:00 69.90
2017-12-29 14:51:00 70.00
2017-12-29 14:52:00 69.97
2017-12-29 14:53:00 69.99
2017-12-29 14:54:00 69.99
2017-12-29 14:55:00 69.85
"""
df = pd.read_csv(StringIO(data), sep=r'\s+', header=None)
stride = np.lib.stride_tricks.as_strided
values = df[2].values
# Only len(values) - 2 complete windows of length 3 exist; a larger shape reads past the buffer
arr = stride(values, (len(values) - 2, 3), values.strides * 2)
df['array'] = pd.Series(arr.tolist(), index=df.index[:len(arr)])
Output (the last two rows have no complete window, so they hold NaN):
             0         1      2                  array
0   2017-12-29  14:44:00  69.90   [69.9, 69.91, 69.87]
1   2017-12-29  14:45:00  69.91  [69.91, 69.87, 69.85]
2   2017-12-29  14:46:00  69.87  [69.87, 69.85, 69.86]
3   2017-12-29  14:47:00  69.85  [69.85, 69.86, 69.92]
4   2017-12-29  14:48:00  69.86   [69.86, 69.92, 69.9]
5   2017-12-29  14:49:00  69.92    [69.92, 69.9, 70.0]
6   2017-12-29  14:50:00  69.90    [69.9, 70.0, 69.97]
7   2017-12-29  14:51:00  70.00   [70.0, 69.97, 69.99]
8   2017-12-29  14:52:00  69.97  [69.97, 69.99, 69.99]
9   2017-12-29  14:53:00  69.99  [69.99, 69.99, 69.85]
10  2017-12-29  14:54:00  69.99                    NaN
11  2017-12-29  14:55:00  69.85                    NaN
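As a hedged alternative to calling as_strided directly, NumPy 1.20 and later provide numpy.lib.stride_tricks.sliding_window_view, which validates the window size and returns a read-only view, so only complete windows are produced and the data is not copied. The sketch below assumes that NumPy version is available; the variable names are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

values = np.array([69.90, 69.91, 69.87, 69.85, 69.86, 69.92,
                   69.90, 70.00, 69.97, 69.99, 69.99, 69.85])

# Each row is one window of length 3; the result is a view onto `values`, not a copy
windows = sliding_window_view(values, window_shape=3)

print(windows.shape)  # (10, 3)
print(windows[0])     # the first window
```

Because the result is a view, windows[i] can be indexed like l3_list[i] in the question without storing ten separate arrays.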
Answer 1 (score: 0)
Here is another trick to get the sliding windows:
Setup:
import pandas as pd
d = {pd.Timestamp('2017-12-29 14:44:00'): 69.9,
     pd.Timestamp('2017-12-29 14:45:00'): 69.91,
     pd.Timestamp('2017-12-29 14:46:00'): 69.87,
     pd.Timestamp('2017-12-29 14:47:00'): 69.85,
     pd.Timestamp('2017-12-29 14:48:00'): 69.86,
     pd.Timestamp('2017-12-29 14:49:00'): 69.92,
     pd.Timestamp('2017-12-29 14:50:00'): 69.9,
     pd.Timestamp('2017-12-29 14:51:00'): 70.0,
     pd.Timestamp('2017-12-29 14:52:00'): 69.97,
     pd.Timestamp('2017-12-29 14:53:00'): 69.99,
     pd.Timestamp('2017-12-29 14:54:00'): 69.99,
     pd.Timestamp('2017-12-29 14:55:00'): 69.85}
ser = pd.Series(d)
Use an empty list with rolling, and apply with append:
lol = []
ser.rolling(3).apply((lambda x: lol.append(x.values) or 0), raw=False)
lol
Output:
[array([69.9 , 69.91, 69.87]),
array([69.91, 69.87, 69.85]),
array([69.87, 69.85, 69.86]),
array([69.85, 69.86, 69.92]),
array([69.86, 69.92, 69.9 ]),
array([69.92, 69.9 , 70. ]),
array([69.9 , 70. , 69.97]),
array([70. , 69.97, 69.99]),
array([69.97, 69.99, 69.99]),
array([69.99, 69.99, 69.85])]
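Regarding the memory question, one option is not to materialize the windows at all and to slice them on demand inside __getitem__. The following is a minimal sketch under stated assumptions, not a definitive implementation: RollingWindowDataset is a hypothetical name, and the class is plain Python implementing only __len__ and __getitem__ so that it runs without PyTorch installed. In a real project it would presumably subclass torch.utils.data.Dataset and convert each window with torch.from_numpy.

```python
import numpy as np


class RollingWindowDataset:
    """Hypothetical map-style dataset: sample i is values[i : i + window]."""

    def __init__(self, values, window):
        # Keep a single 1-D array; no per-window copies are stored
        self.values = np.asarray(values, dtype=np.float64)
        self.window = window

    def __len__(self):
        # Number of complete windows (0 if the series is shorter than the window)
        return max(len(self.values) - self.window + 1, 0)

    def __getitem__(self, i):
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Basic slicing returns a view into self.values, so nothing is copied
        return self.values[i:i + self.window]


dataset = RollingWindowDataset(
    [69.90, 69.91, 69.87, 69.85, 69.86, 69.92,
     69.90, 70.00, 69.97, 69.99, 69.99, 69.85],
    window=3,
)
print(len(dataset))  # 10
print(dataset[0])    # the first window
```

With the series from the setup above, the values could be passed as ser.values. Only the original array is kept in memory, and each dataset[i] is a length-3 view onto it.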