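The following code does not work. I am trying to take my timestamp data, aggregate it at 10-minute intervals, and do time series analysis in pandas. I get a ValueError on the last command and do not know how to fix it.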
0 2016-01-01 00:11:52
1 2016-01-01 00:13:00
2 2016-01-01 00:14:49
3 2016-01-01 00:21:00
4 2016-01-01 00:23:05
5 2016-01-01 00:29:00
6 2016-01-01 00:30:00
7 2016-01-01 00:30:36
8 2016-01-01 00:32:00
9 2016-01-01 00:33:00
10 2016-01-01 00:36:40
11 2016-01-01 00:36:55
12 2016-01-01 00:41:00
13 2016-01-01 00:48:17
14 2016-01-01 00:50:49
15 2016-01-01 00:51:00
16 2016-01-01 00:53:00
17 2016-01-01 00:56:00
18 2016-01-01 00:57:00
19 2016-01-01 00:57:35
20 2016-01-01 01:01:00
21 2016-01-01 01:01:37
22 2016-01-01 01:02:07
23 2016-01-01 01:08:32
24 2016-01-01 01:09:00
25 2016-01-01 01:16:00
26 2016-01-01 01:18:47
27 2016-01-01 01:21:00
28 2016-01-01 01:27:34
29 2016-01-01 01:29:07
...
values = series.values
values = values.reshape((len(values), 1))
scaler = MinMaxScaler(feature_range=(0, 1))
scaler = scaler.fit(values)
Answer 0 (score: 1)
Assuming the dtype is Timestamp. If not, do this first:
series = pd.to_datetime(series)
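The problem is that you need to turn these values into numbers that make sense for scaling.
I subtract the minimum date from the series to get a series of Timedeltas, then take the total seconds of each Timedelta as a float.
Now you have values suitable for MinMaxScaler: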
values = series.sub(series.min()).dt.total_seconds().values
values = values.reshape((len(values), 1))
scaler = MinMaxScaler(feature_range=(0, 1))
scaler = scaler.fit(values)
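As a minimal, self-contained sketch of the steps above, the snippet below uses a handful of the timestamps from the question. To keep it dependent on pandas only, a hand-written min-max formula stands in for MinMaxScaler; that substitution is an assumption made for illustration and is not part of the original answer.

```python
import pandas as pd

# A few timestamps taken from the question's data
series = pd.Series(pd.to_datetime([
    "2016-01-01 00:11:52",
    "2016-01-01 00:13:00",
    "2016-01-01 00:21:00",
    "2016-01-01 00:30:00",
    "2016-01-01 01:29:07",
]))

# Seconds elapsed since the earliest timestamp, as floats
seconds = series.sub(series.min()).dt.total_seconds()

# Manual min-max scaling to [0, 1] (illustrative stand-in for MinMaxScaler)
scaled = (seconds - seconds.min()) / (seconds.max() - seconds.min())

# 2-D column shape, as used before fitting the scaler in the answer
values = seconds.values.reshape((len(seconds), 1))
```

The earliest timestamp maps to 0 and the latest to 1; everything in between is placed proportionally to its elapsed time.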
Setup for debugging
from io import StringIO
import pandas as pd
txt = """0 2016-01-01 00:11:52
1 2016-01-01 00:13:00
2 2016-01-01 00:14:49
3 2016-01-01 00:21:00
4 2016-01-01 00:23:05
5 2016-01-01 00:29:00
6 2016-01-01 00:30:00
7 2016-01-01 00:30:36
8 2016-01-01 00:32:00
9 2016-01-01 00:33:00
10 2016-01-01 00:36:40
11 2016-01-01 00:36:55
12 2016-01-01 00:41:00
13 2016-01-01 00:48:17
14 2016-01-01 00:50:49
15 2016-01-01 00:51:00
16 2016-01-01 00:53:00
17 2016-01-01 00:56:00
18 2016-01-01 00:57:00
19 2016-01-01 00:57:35
20 2016-01-01 01:01:00
21 2016-01-01 01:01:37
22 2016-01-01 01:02:07
23 2016-01-01 01:08:32
24 2016-01-01 01:09:00
25 2016-01-01 01:16:00
26 2016-01-01 01:18:47
27 2016-01-01 01:21:00
28 2016-01-01 01:27:34
29 2016-01-01 01:29:07"""
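# Columns are separated by two or more spaces; the first column is the index.
# The squeeze=True argument was removed in pandas 2.0, so squeeze the
# single-column DataFrame into a Series explicitly.
series = pd.read_csv(StringIO(txt),
                     sep=r'\s{2,}', header=None,
                     index_col=0,
                     engine='python').squeeze('columns').rename_axis(None)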
series = pd.to_datetime(series)
series.sub(series.min()).dt.total_seconds()
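For the 10-minute aggregation mentioned in the question, one possible sketch is shown here. It assumes that "aggregate" means counting events per 10-minute bin, which the question does not spell out; the sample timestamps are the first few from the question.

```python
import pandas as pd

# First few timestamps from the question's data
series = pd.Series(pd.to_datetime([
    "2016-01-01 00:11:52",
    "2016-01-01 00:13:00",
    "2016-01-01 00:14:49",
    "2016-01-01 00:21:00",
    "2016-01-01 00:23:05",
    "2016-01-01 00:29:00",
    "2016-01-01 00:30:00",
]))

# One event per timestamp, indexed by time (assumption: aggregation = count)
events = pd.Series(1, index=pd.DatetimeIndex(series))

# Count the events that fall into each 10-minute bin
counts = events.resample("10min").sum()
```

The resulting counts are plain integers, so they can be reshaped and passed to a scaler directly, without any timestamp conversion.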
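Answer 1 (score: 0)
Try adding this line:
series = pd.to_datetime(series)
before scaling. Note that scaling will turn your timestamps into floats, and I don't think that can be meaningfully reversed.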
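On the question of reversibility: if the reference timestamp and the scaling range are kept, the floats can be mapped back to timestamps. The sketch below uses a hand-written min-max scaling under that assumption; scikit-learn's MinMaxScaler also offers inverse_transform for the scaling step. The variable names are illustrative only.

```python
import pandas as pd

# A few timestamps taken from the question's data
series = pd.Series(pd.to_datetime([
    "2016-01-01 00:11:52",
    "2016-01-01 00:13:00",
    "2016-01-01 00:21:00",
    "2016-01-01 01:29:07",
]))

# Forward: timestamps -> elapsed seconds -> values in [0, 1]
origin = series.min()
seconds = series.sub(origin).dt.total_seconds()
lo, hi = seconds.min(), seconds.max()
scaled = (seconds - lo) / (hi - lo)

# Reverse: undo the scaling, then add the elapsed time back to the origin.
# Rounding to whole seconds absorbs floating-point error from the round trip.
restored_seconds = scaled * (hi - lo) + lo
restored = (origin + pd.to_timedelta(restored_seconds, unit="s")).dt.round("s")
```

The round trip only works when origin, lo, and hi are stored alongside the scaled values; without them the floats carry no absolute time information.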