Keras TimeSeries - Regression with negative values

Time: 2018-11-11 08:03:30

Tags: keras time-series regression lstm convolution


I am trying to do a regression task on a time series. My data looks like the rows shown below, with a window size of 10; the input features are as shown and the target is the 5th column. As you can see, the target values are {70, 110, -100, 540, -130, 50}.

My model is as follows:

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

model = Sequential([
    Conv1D(filters=filters, kernel_size=kernel_size, activation='relu',
           input_shape=(window_size, nb_series)),   # window_size timesteps, nb_series features
    MaxPooling1D(),
    Conv1D(filters=filters, kernel_size=kernel_size, activation='relu'),
    MaxPooling1D(),
    Flatten(),
    Dense(nb_outputs, activation='linear'),
])
model.compile(loss='mse', optimizer='adam', metrics=['mae'])

My input features look like this:

0.00000000,0.42857143,0.57142857,0.00000000,70.00000000,1.00061741,1.00002238,22.40000000,24.85000000,30.75000000,8.10000000,1.00015876,1.00294701,0.99736059,-44.57995000,1.00166700,0.99966561,-0.00003286,0.00030157,1.00252034,49.18000000,40.96386000,19.74918000,-62.22000000
0.00000000,0.09090909,0.72727273,0.18181818,110.00000000,0.99963650,0.99928427,19.19000000,28.89000000,26.65000000,8.60000000,0.99939526,1.00217111,0.99660950,12.04301000,1.00082978,0.99883018,0.00008147,0.00026953,1.00153663,53.70000000,84.81013000,49.33018000,-42.22000000
0.00000000,0.20000000,0.80000000,0.00000000,-100.00000000,1.00034178,1.00016118,19.04000000,27.35000000,36.43000000,9.00000000,1.00028776,1.00300655,0.99756896,-40.34054000,1.00162433,0.99962294,-0.00000094,0.00019842,1.00235166,48.98000000,73.17073000,64.22563000,-62.22000000
0.00000000,0.07407407,0.92592593,0.00000000,540.00000000,0.99554634,0.99608051,20.92000000,32.90000000,20.02000000,12.60000000,0.99583374,0.99957548,0.99209201,166.35514000,0.99723072,0.99523842,0.00069929,0.00025201,0.99342482,67.12000000,89.24051000,83.36000000,-4.23000000
1.00000000,0.30769231,0.53846154,0.15384615,-130.00000000,0.99639984,0.99731696,21.73000000,29.41000000,17.35000000,12.20000000,0.99672034,1.00037538,0.99306530,119.32773000,0.99799071,0.99599723,0.00083646,0.00027643,0.99429023,64.25000000,86.70213000,86.32629000,-13.89000000
1.00000000,0.20000000,0.20000000,0.60000000,50.00000000,0.99590955,0.99698694,24.48000000,37.15000000,15.04000000,12.90000000,0.99618042,1.00005922,0.99230162,123.46570000,0.99737959,0.99538689,0.00105610,0.00034937,0.99368338,66.72000000,87.79070000,86.43382000,-1.39000000
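For reference, here is a minimal sketch of how rows like these could be turned into (window, target) pairs, assuming the features are loaded into a NumPy array data of shape (n_rows, 24), with window_size = 10 and the target taken from column index 4 (the 5th column). Predicting the row immediately after each window is my own assumption; the exact alignment is not shown in the original code.

import numpy as np

# Hypothetical sketch: build (window, target) pairs from a 2D float array `data`
# of shape (n_rows, 24).  The target alignment (row right after each window)
# is an assumption.
def make_windows(data, window_size=10, target_col=4):
    X, y = [], []
    for i in range(len(data) - window_size):
        X.append(data[i:i + window_size, :])          # (window_size, 24) feature block
        y.append(data[i + window_size, target_col])   # target: 5th column of the next row
    return np.array(X), np.array(y)

# X then has shape (samples, window_size, nb_series), matching the Conv1D input_shape above.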


No matter how many epochs I train for, and regardless of switching between activation functions and optimizers, I get the loss shown below. I understand this is because the mean of my dataset's outputs is between 122 and 124, which is why I always end up at this value.

297055/297071 [============================>.] - ETA: 0s - loss: 22789.0087 - mean_absolute_error: 123.0670
297071/297071 [==============================] - 144s 486us/step - loss: 22788.9740 - mean_absolute_error: 123.0673 - val_loss: 10519.1722 - val_mean_absolute_error: 79.3461
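That interpretation can be checked directly: a model that always predicts roughly the same value has an MAE equal to the mean absolute deviation of the targets around that value, so an MAE stuck near 123 is consistent with the network collapsing to a constant output. A quick sketch of that baseline check, assuming the targets are in a 1-D NumPy array y_train (my variable name):

import numpy as np

# Sanity check: MAE of a constant predictor.  If the trained model's MAE is
# close to this number, it is doing no better than predicting a constant.
const_pred = np.median(y_train)                    # the median minimises MAE for a constant
baseline_mae = np.mean(np.abs(y_train - const_pred))
print('constant prediction:', const_pred, 'baseline MAE:', baseline_mae)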

And when I test the predictions with the following code:

pred = model.predict(X_test)
print('\n\nactual', 'predicted', sep='\t')
for actual, predicted in zip(y_test, pred.squeeze()):
    print(actual.squeeze(), predicted, sep='\t')

I get the following output:
With linear activation on the output layer:

20.0    -0.059563223
-22.0   -0.059563223
-55.0   -0.059563223

With relu activation on the output layer:

235.0 0.0
-170.0 0.0
154.0 0.0

And with sigmoid:

-54.0   1.4216835e-36
-39.0   0.0
66.0    2.0888916e-37
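For what it's worth, these three sets of predictions line up with the output ranges of the activations: linear is unbounded, relu clamps everything below zero to 0, and sigmoid squashes outputs into (0, 1), so the latter two can never reach the negative targets. A tiny standalone illustration (plain NumPy, not the original model):

import numpy as np

# Output ranges of the three activations, evaluated on a few target-scale values.
z = np.array([-130.0, -55.0, 0.0, 70.0, 540.0])

linear = z                           # unbounded: can represent negative targets
relu = np.maximum(z, 0.0)            # [0, inf): every negative becomes exactly 0
sigmoid = 1.0 / (1.0 + np.exp(-z))   # (0, 1): can never reach the target scale, let alone negatives

print(linear)
print(relu)
print(sigmoid)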

Is there a way to predict continuous values like the targets above?

Is it the activation function?

Is it a feature selection problem?

Is it an architecture problem? Would an LSTM perhaps work better?

Any suggestions on kernel size, filters, loss, activation, and optimizer are also greatly appreciated.

Update: I tried using an LSTM with the following model:

from keras.layers import LSTM

# design network
model = Sequential()
model.add(LSTM(50, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam', metrics=['mae'])
# fit network
model.fit(X_train, y_train, epochs=2, batch_size=10,
          validation_data=(X_test, y_test), shuffle=False)
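A note on the shapes this assumes: input_shape=(X.shape[1], X.shape[2]) implies X is already a 3-D array in the (samples, timesteps, features) layout that Keras recurrent layers expect; with the windowing sketched earlier that would be (samples, 10, 24). A quick check, using my variable names:

# The LSTM expects 3-D input: (samples, timesteps, features).
# With window_size = 10 and 24 columns per row this should print
# something like (n_samples, 10, 24) and (n_samples,).
print(X_train.shape)
print(y_train.shape)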

I got the following loss:

297071/297071 [==============================] - 196s 661us/step - loss: 122.8202 - mean_absolute_error: 122.8202 - val_loss: 78.2440 - val_mean_absolute_error: 78.2440
Epoch 2/2
297071/297071 [==============================] - 196s 661us/step - loss: 122.3811 - mean_absolute_error: 122.3811 - val_loss: 78.4328 - val_mean_absolute_error: 78.4328

And the following predicted values:

-55.0   -45.222805
-105.0  -21.363165
29.0    -18.858946
-125.0  -34.27912
-134.0  20.847342
-108.0  30.286516
113.0   31.09069
-63.0   8.848535

Is it the architecture or the data?

0 Answers:

There are no answers yet.