Question

I have this data:

Date,Open,High,Low,Close,Adj Close,Volume
2007-01-03,12.160000,12.750000,11.530000,12.040000,12.040000,0
2007-01-04,12.400000,12.420000,11.280000,11.510000,11.510000,0
2007-01-05,11.840000,12.250000,11.680000,12.140000,12.140000,0
2007-01-08,12.480000,12.830000,11.780000,12.000000,12.000000,0
2007-01-09,11.860000,12.470000,11.690000,11.910000,11.910000,0
2007-01-10,12.340000,12.500000,11.430000,11.470000,11.470000,0
2007-01-11,11.420000,11.480000,10.500000,10.870000,10.870000,0
2007-01-12,10.930000,10.930000,10.140000,10.150000,10.150000,0
2007-01-16,10.640000,10.890000,10.400000,10.740000,10.740000,0
2007-01-17,10.900000,10.900000,10.350000,10.590000,10.590000,0
2007-01-18,10.650000,11.040000,10.450000,10.850000,10.850000,0
2007-01-19,10.800000,11.030000,10.240000,10.400000,10.400000,0
2007-01-22,10.770000,11.080000,10.620000,10.770000,10.770000,0
2007-01-23,10.770000,10.940000,10.220000,10.340000,10.340000,0

I have this code, which does some time series forecasting:

import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import r2_score
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping
from keras.optimizers import Adam
from keras.layers import LSTM
df = pd.read_csv("^VIX.csv")
df.drop(['Open', 'High', 'Low', 'Close', 'Volume'], axis=1, inplace=True)
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index(['Date'], drop=True)
split_date = pd.Timestamp('2016-01-01')
df =  df['Adj Close']
train = df.loc[:split_date]
test = df.loc[split_date:]
plt.figure(figsize=(10, 6))
ax = train.plot()
test.plot(ax=ax)
plt.legend(['train', 'test']);

So far so good, but when I run

# scale train and test data to [-1, 1]
scaler = MinMaxScaler(feature_range=(-1, 1))
train_sc = scaler.fit_transform(train)
test_sc = scaler.transform(test)

I get this error:

ValueError: Expected 2D array, got 1D array instead:
array=[12.04 11.51 12.14 ... 16.08 17.290001 18.209999].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

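This is the original code, yet it looks like the reshape is not being done correctly, and I am missing something about how NumPy reshape works.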

What should I fix in the reshape? Thanks!

Answer 1

A solution for your specific case:

train_sc = scaler.fit_transform(train.values.reshape(-1, 1))
test_sc = scaler.transform(test.values.reshape(-1, 1))
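
To see why the reshape helps, here is a minimal sketch that runs without the CSV. The handful of prices is copied from the sample rows in the question, and the 4/2 split is only an assumption to keep the example small. Selecting a single column gives a pandas Series, whose .values is a 1D array of shape (n,), while MinMaxScaler expects a 2D array of shape (n_samples, n_features):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# A few 'Adj Close' values from the question, used as stand-in data.
prices = pd.Series(
    [12.04, 11.51, 12.14, 12.00, 11.91, 11.47],
    index=pd.to_datetime([
        "2007-01-03", "2007-01-04", "2007-01-05",
        "2007-01-08", "2007-01-09", "2007-01-10",
    ]),
    name="Adj Close",
)

# Arbitrary split for illustration: first four rows train, last two test.
train = prices.iloc[:4]
test = prices.iloc[4:]

# reshape(-1, 1) turns the 1D array of shape (n,) into a single-feature
# column of shape (n, 1); -1 lets NumPy infer the number of rows.
train_2d = train.values.reshape(-1, 1)
test_2d = test.values.reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(-1, 1))
train_sc = scaler.fit_transform(train_2d)  # fit on train only
test_sc = scaler.transform(test_2d)        # reuse the train min/max

print(train.values.shape, "->", train_2d.shape)
print(train_sc.ravel())
```

Another option, if you would rather not reshape at all, is to keep the column as a one-column DataFrame by selecting it with double brackets, df[['Adj Close']], which is already 2D.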

Answer 2

It's simple: test is empty, so scaler.transform fails. Change:

split_date = pd.Timestamp('2016-01-01')

to, for example,

split_date = pd.Timestamp('2007-01-10')

and you will see that it works.
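
A quick way to catch this before scaling is to check both halves of the split. The sketch below is only an illustration: split_by_date is a hypothetical helper, not part of pandas or of the original code, and the prices are copied from the sample rows in the question.

```python
import pandas as pd

# Stand-in data taken from the sample rows in the question.
prices = pd.Series(
    [12.04, 11.51, 12.14, 12.00, 11.91, 11.47, 10.87, 10.15],
    index=pd.to_datetime([
        "2007-01-03", "2007-01-04", "2007-01-05", "2007-01-08",
        "2007-01-09", "2007-01-10", "2007-01-11", "2007-01-12",
    ]),
    name="Adj Close",
)


def split_by_date(series, split_date):
    """Hypothetical helper: split a date-indexed Series, refusing empty halves."""
    train = series.loc[:split_date]
    test = series.loc[split_date:]
    if train.empty or test.empty:
        raise ValueError(
            f"split_date {split_date.date()} leaves an empty split; "
            f"data covers {series.index.min().date()} "
            f"to {series.index.max().date()}"
        )
    return train, test


train, test = split_by_date(prices, pd.Timestamp("2007-01-10"))
print(len(train), len(test))
```

With this sample, a split date of 2016-01-01 raises the ValueError instead of silently producing an empty test set. Note that label slicing with .loc includes both endpoints, so a split date that is present in the index ends up in both train and test.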

How do I reshape with NumPy?
