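I am trying to use Keras with the Theano backend in Python to make predictions on time-series data, where I use factors from the past few days to predict the next day. I can generate predictions with other algorithms (such as xgboost), but I wanted to try an ANN and ran into an index out of bounds error in the prediction step.
Part of the code looks like this: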
clfnn = Sequential()
clfnn.add(Dense(32, input_dim=9,init='uniform',activation='tanh'))
clfnn.add(Dense(9,init='uniform',activation='tanh'))
clfnn.add(Dense(1,activation='tanh'))
clfnn.compile(loss='mse', optimizer=sgd, metrics=['accuracy'])
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(testsample[factorsnew])
testsample[factorsnew]=imp.transform(testsample[factorsnew])
validationsample[factorsnew]=imp.transform(validationsample[factorsnew])
models={'prediction':clfnn}
for key in models:
    try:
        models[key].fit(testsample[factorsnew].as_matrix(),testsample['returnranksnn'].as_matrix(),verbose=1)
        validationsample[key]=models[key].predict_proba(validationsample[factorsnew].as_matrix(),verbose=1)[:,1]
    except:
        print sys.exc_info()[0]
        print sys.exc_info()[1]
        pass
The model seems to train without any problems, but the prediction step raises an error. The output looks like this:
Epoch 1/10
32240/32240 [==============================] - 0s - loss: 0.2506 - acc: 0.5980
Epoch 2/10
32240/32240 [==============================] - 0s - loss: 0.2504 - acc: 0.6054
Epoch 3/10
32240/32240 [==============================] - 0s - loss: 0.2504 - acc: 0.6069
Epoch 4/10
32240/32240 [==============================] - 0s - loss: 0.2505 - acc: 0.6028
Epoch 5/10
32240/32240 [==============================] - 0s - loss: 0.2504 - acc: 0.6015
Epoch 6/10
32240/32240 [==============================] - 0s - loss: 0.2503 - acc: 0.6067
Epoch 7/10
32240/32240 [==============================] - 0s - loss: 0.2504 - acc: 0.6020
Epoch 8/10
32240/32240 [==============================] - 0s - loss: 0.2505 - acc: 0.5999
Epoch 9/10
32240/32240 [==============================] - 0s - loss: 0.2504 - acc: 0.6040
Epoch 10/10
32240/32240 [==============================] - 0s - loss: 0.2505 - acc: 0.6024
32/40 [=======================>......] - ETA: 0s<type 'exceptions.IndexError'>
index 1 is out of bounds for axis 1 with size 1
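Note: the data is normalized and contains no NaNs. The target variable is of type int with only two outcomes, 0 or 1, and the result should be just a single probability.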
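To isolate the message, here is a minimal NumPy-only sketch. It assumes (this is not verified against the real model) that the prediction array has a single column, as a final Dense(1) layer would suggest. `fake_predictions` is a hypothetical stand-in, not the actual output of `predict_proba`.

```python
import numpy as np

# Hypothetical stand-in for the prediction output: 40 rows, 1 column.
# The shape is an assumption based on the final Dense(1) layer.
fake_predictions = np.full((40, 1), 0.5)

error_message = None
try:
    # Asks for a second column (index 1) that does not exist.
    fake_predictions[:, 1]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)

# Column 0 is the only column available in this stand-in array.
first_column = fake_predictions[:, 0]
print(first_column.shape)
```

If the real prediction array also has shape (n, 1), the `[:,1]` at the end of the `predict_proba` line would fail in the same way, and `[:,0]` would select the only available column.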
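I tried changing the optimizer settings and using different factors in the data, but to no avail. As you can see in the output, most samples stop at 32/40 or 32/***. Any ideas about what I am missing here? Thanks.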