ValueError: could not convert string to float with time series data

Time: 2017-04-05 19:20:12

Tags: python pandas machine-learning time-series

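The following code does not work. I am trying to take my timestamp data, aggregate it in 10-minute intervals, and run a time series analysis on it in pandas. The last command raises a ValueError and I don't know how to fix it.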

0       2016-01-01 00:11:52
1       2016-01-01 00:13:00
2       2016-01-01 00:14:49
3       2016-01-01 00:21:00
4       2016-01-01 00:23:05
5       2016-01-01 00:29:00
6       2016-01-01 00:30:00
7       2016-01-01 00:30:36
8       2016-01-01 00:32:00
9       2016-01-01 00:33:00
10      2016-01-01 00:36:40
11      2016-01-01 00:36:55
12      2016-01-01 00:41:00
13      2016-01-01 00:48:17
14      2016-01-01 00:50:49
15      2016-01-01 00:51:00
16      2016-01-01 00:53:00
17      2016-01-01 00:56:00
18      2016-01-01 00:57:00
19      2016-01-01 00:57:35
20      2016-01-01 01:01:00
21      2016-01-01 01:01:37
22      2016-01-01 01:02:07
23      2016-01-01 01:08:32
24      2016-01-01 01:09:00
25      2016-01-01 01:16:00
26      2016-01-01 01:18:47
27      2016-01-01 01:21:00
28      2016-01-01 01:27:34
29      2016-01-01 01:29:07
               ...        
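from sklearn.preprocessing import MinMaxScaler

values = series.values
values = values.reshape((len(values), 1))
scaler = MinMaxScaler(feature_range=(0, 1))
# The ValueError is raised here: the values are still timestamp strings
scaler = scaler.fit(values)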

2 Answers:

Answer 0: (score: 1)

This assumes the dtype is Timestamp. If it is not, run this first:

series = pd.to_datetime(series)

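The problem is that you need to turn these values into numbers that make sense for scaling. I subtract the minimum date from the series to get a series of Timedeltas, then take the total number of seconds in each Timedelta as a float. Now you are ready for MinMaxScaler:
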
values = series.sub(series.min()).dt.total_seconds().values
values = values.reshape((len(values), 1))
scaler = MinMaxScaler(feature_range=(0, 1))
scaler = scaler.fit(values)
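
For reference, here is a minimal self-contained sketch of the steps above. The three-row sample is only an illustration taken from the question's data, and calling fit_transform (rather than fit followed by transform) is my own choice here, not something the original code does.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative sample: three of the timestamps from the question, as strings.
series = pd.Series(["2016-01-01 00:11:52",
                    "2016-01-01 00:13:00",
                    "2016-01-01 00:21:00"])

# Parse the strings into Timestamps, then measure seconds since the earliest one.
series = pd.to_datetime(series)
seconds = series.sub(series.min()).dt.total_seconds()

# MinMaxScaler expects a 2-D array of shape (n_samples, n_features).
values = seconds.to_numpy().reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)
```

The earliest timestamp maps to 0 and the latest to 1, with the others spaced in proportion to their offset in seconds.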

Setup for debugging

from io import StringIO
import pandas as pd

txt = """0       2016-01-01 00:11:52
1       2016-01-01 00:13:00
2       2016-01-01 00:14:49
3       2016-01-01 00:21:00
4       2016-01-01 00:23:05
5       2016-01-01 00:29:00
6       2016-01-01 00:30:00
7       2016-01-01 00:30:36
8       2016-01-01 00:32:00
9       2016-01-01 00:33:00
10      2016-01-01 00:36:40
11      2016-01-01 00:36:55
12      2016-01-01 00:41:00
13      2016-01-01 00:48:17
14      2016-01-01 00:50:49
15      2016-01-01 00:51:00
16      2016-01-01 00:53:00
17      2016-01-01 00:56:00
18      2016-01-01 00:57:00
19      2016-01-01 00:57:35
20      2016-01-01 01:01:00
21      2016-01-01 01:01:37
22      2016-01-01 01:02:07
23      2016-01-01 01:08:32
24      2016-01-01 01:09:00
25      2016-01-01 01:16:00
26      2016-01-01 01:18:47
27      2016-01-01 01:21:00
28      2016-01-01 01:27:34
29      2016-01-01 01:29:07"""

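# read_csv no longer accepts squeeze=True in pandas 2.x, so squeeze the
# single-column frame into a Series afterwards; the regex is a raw string.
series = pd.read_csv(StringIO(txt),
                     sep=r'\s{2,}', header=None,
                     index_col=0,
                     engine='python').squeeze('columns').rename_axis(None)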

series = pd.to_datetime(series)

series.sub(series.min()).dt.total_seconds()
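
The question also mentions aggregating in 10-minute intervals, which the code above does not cover. One possible way to do it, as a rough sketch: put the timestamps on the index and count events per window with resample. The names stamps, events and counts are made up for this example, and counting events per bin is my assumption about what "aggregate" means here.

```python
import pandas as pd

# Illustrative sample: a handful of the timestamps from the question.
stamps = pd.to_datetime(pd.Series([
    "2016-01-01 00:11:52",
    "2016-01-01 00:13:00",
    "2016-01-01 00:14:49",
    "2016-01-01 00:21:00",
    "2016-01-01 00:23:05",
    "2016-01-01 00:30:00",
]))

# Put the timestamps on the index so resample() can bin them,
# then count how many events fall into each 10-minute window.
events = pd.Series(1, index=pd.DatetimeIndex(stamps))
counts = events.resample("10min").sum()
```

With this sample the windows starting at 00:10, 00:20 and 00:30 hold 3, 2 and 1 events. Since counts is already numeric, it could be reshaped and passed to MinMaxScaler without any datetime conversion.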

Answer 1: (score: 0)

Try adding this line before scaling:

series = pd.to_datetime(series)

However, it will turn your timestamps into floats, and I don't think that can be meaningfully reversed.
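
On reversibility: if the timestamps are converted by subtracting the minimum date and taking total seconds (as in the other answer), and that minimum is kept around, the scaled values can at least mechanically be mapped back. A rough sketch under those assumptions follows; the variable names start and restored are mine, and the rounding step assumes the original data has whole-second resolution.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative sample: three of the timestamps from the question.
series = pd.to_datetime(pd.Series(["2016-01-01 00:11:52",
                                   "2016-01-01 00:13:00",
                                   "2016-01-01 00:21:00"]))

# Keep the earliest timestamp so offsets can be turned back into dates.
start = series.min()
values = series.sub(start).dt.total_seconds().to_numpy().reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(0, 1)).fit(values)
scaled = scaler.transform(values)

# Undo the scaling, round away floating-point noise (whole seconds assumed),
# then add the offsets back onto the stored start.
seconds = scaler.inverse_transform(scaled).ravel().round()
restored = start + pd.to_timedelta(seconds, unit="s")
```

Whether the round trip is useful depends on what is done with the scaled values in between; this only shows that the transformation itself can be inverted.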