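I am trying to do a regression task on a time series. My data is shown below: the window size is 10, the input features are listed further down, and the target is the 5th column. As you can see in the sample rows, its values are {70, 110, -100, 540, -130, 50}.
My model is as follows: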
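    from keras.models import Sequential
    from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

    # filters, kernel_size, window_size, nb_series and nb_outputs are defined elsewhere
    model = Sequential([
        Conv1D(filters=filters, kernel_size=kernel_size, activation='relu',
               input_shape=(window_size, nb_series)),
        MaxPooling1D(),
        Conv1D(filters=filters, kernel_size=kernel_size, activation='relu'),
        MaxPooling1D(),
        Flatten(),
        Dense(nb_outputs, activation='linear'),  # linear output for regression
    ])
    model.compile(loss='mse', optimizer='adam', metrics=['mae'])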
My input features look like this (one row per time step, 24 columns):
    0.00000000,0.42857143,0.57142857,0.00000000,70.00000000,1.00061741,1.00002238,22.40000000,24.85000000,30.75000000,8.10000000,1.00015876,1.00294701,0.99736059,-44.57995000,1.00166700,0.99966561,-0.00003286,0.00030157,1.00252034,49.18000000,40.96386000,19.74918000,-62.22000000
    0.00000000,0.09090909,0.72727273,0.18181818,110.00000000,0.99963650,0.99928427,19.19000000,28.89000000,26.65000000,8.60000000,0.99939526,1.00217111,0.99660950,12.04301000,1.00082978,0.99883018,0.00008147,0.00026953,1.00153663,53.70000000,84.81013000,49.33018000,-42.22000000
    0.00000000,0.20000000,0.80000000,0.00000000,-100.00000000,1.00034178,1.00016118,19.04000000,27.35000000,36.43000000,9.00000000,1.00028776,1.00300655,0.99756896,-40.34054000,1.00162433,0.99962294,-0.00000094,0.00019842,1.00235166,48.98000000,73.17073000,64.22563000,-62.22000000
    0.00000000,0.07407407,0.92592593,0.00000000,540.00000000,0.99554634,0.99608051,20.92000000,32.90000000,20.02000000,12.60000000,0.99583374,0.99957548,0.99209201,166.35514000,0.99723072,0.99523842,0.00069929,0.00025201,0.99342482,67.12000000,89.24051000,83.36000000,-4.23000000
    1.00000000,0.30769231,0.53846154,0.15384615,-130.00000000,0.99639984,0.99731696,21.73000000,29.41000000,17.35000000,12.20000000,0.99672034,1.00037538,0.99306530,119.32773000,0.99799071,0.99599723,0.00083646,0.00027643,0.99429023,64.25000000,86.70213000,86.32629000,-13.89000000
    1.00000000,0.20000000,0.20000000,0.60000000,50.00000000,0.99590955,0.99698694,24.48000000,37.15000000,15.04000000,12.90000000,0.99618042,1.00005922,0.99230162,123.46570000,0.99737959,0.99538689,0.00105610,0.00034937,0.99368338,66.72000000,87.79070000,86.43382000,-1.39000000
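For context, here is a minimal sketch of how rows like these could be turned into windows of 10 time steps. It is not my exact preprocessing: `make_windows` is a hypothetical helper, the toy array is random stand-in data, and I am assuming the target for each window is the 5th column (index 4) of the row right after the window.

```python
import numpy as np

def make_windows(data, window_size=10, target_col=4):
    """Slice a (timesteps, features) array into overlapping windows.

    Assumption: the target of each window is the value of `target_col`
    in the row immediately following that window.
    """
    data = np.asarray(data, dtype=np.float32)
    n_windows = len(data) - window_size
    if n_windows <= 0:
        raise ValueError("need more rows than window_size")
    # X[i] holds rows i .. i + window_size - 1
    X = np.stack([data[i:i + window_size] for i in range(n_windows)])
    # y[i] is the target column of row i + window_size
    y = data[window_size:, target_col]
    return X, y

# Toy stand-in for the real series: 15 time steps, 24 features.
rng = np.random.default_rng(0)
toy = rng.normal(size=(15, 24))
X, y = make_windows(toy)
print(X.shape, y.shape)
```

The resulting `X` has the `(samples, window_size, nb_series)` layout that the `input_shape` of the first `Conv1D` layer expects.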
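No matter how many epochs I train for, and no matter which activation functions or optimizers I switch between, I get the loss shown below. I understand this is because the mean of my dataset's output is between 122 and 124, which is why I always end up with this value.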
    297055/297071 [============================>.] - ETA: 0s - loss: 22789.0087 - mean_absolute_error: 123.0670
    297071/297071 [==============================] - 144s 486us/step - loss: 22788.9740 - mean_absolute_error: 123.0673 - val_loss: 10519.1722 - val_mean_absolute_error: 79.3461
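To make that suspicion checkable, the network's error can be compared with what a constant predictor would score. The sketch below is only an illustration: `constant_baseline` is a hypothetical helper, and `y_sample` holds just the six target values from the sample rows, not my real training targets.

```python
import numpy as np

def constant_baseline(y):
    """Errors of constant predictors, for comparison with the model's metrics."""
    y = np.asarray(y, dtype=np.float64)
    mean_pred = y.mean()        # the constant that minimizes MSE
    median_pred = np.median(y)  # the constant that minimizes MAE
    return {
        "mse_of_mean": float(np.mean((y - mean_pred) ** 2)),
        "mae_of_mean": float(np.mean(np.abs(y - mean_pred))),
        "mae_of_median": float(np.mean(np.abs(y - median_pred))),
        "mae_of_zero": float(np.mean(np.abs(y))),  # error when always predicting 0
    }

# The six target values visible in the sample rows (not the full dataset).
y_sample = [70, 110, -100, 540, -130, 50]
baseline = constant_baseline(y_sample)
print(baseline)
```

If the MAE reported during training is about the same as one of these numbers computed on `y_train`, the model is most likely outputting a constant and ignoring the inputs.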
I test the predictions with the following code:
    pred = model.predict(X_test)
    print('\n\nactual', 'predicted', sep='\t')
    # print each true value next to the model's prediction
    for actual, predicted in zip(y_test, pred.squeeze()):
        print(actual.squeeze(), predicted, sep='\t')
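I get the following output (actual value, then prediction).
With a linear activation in the output layer:
    20.0    -0.059563223
    -22.0   -0.059563223
    -55.0   -0.059563223
With a relu activation in the output layer:
    235.0   0.0
    -170.0  0.0
    154.0   0.0
And with sigmoid:
    -54.0   1.4216835e-36
    -39.0   0.0
    66.0    2.0888916e-37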
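A side note on the last two runs: relu cannot output negative values and sigmoid is bounded between 0 and 1, so neither can ever match targets such as -170 or 235, however long the model trains. A quick NumPy check of those ranges (the pre-activation values in `z` are made up for illustration):

```python
import numpy as np

def relu(z):
    # relu clips everything below zero to zero
    return np.maximum(z, 0.0)

def sigmoid(z):
    # sigmoid squashes any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Made-up pre-activation values covering a wide range.
z = np.array([-500.0, -3.0, 0.0, 3.0, 500.0])
print("relu:   ", relu(z))
print("sigmoid:", sigmoid(z))
```

So only the linear output layer is a candidate here, and the constant prediction in that run is the part I cannot explain.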
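Is there any way to predict continuous integer values like the ones above?
Is it the activation function?
Is it a feature selection problem?
Is it an architecture problem, and would an LSTM perhaps work better?
Any suggestions about kernel size, number of filters, loss, activation and optimizer are also much appreciated.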
Update:
I tried an LSTM with the following model:
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # design network
    model = Sequential()
    model.add(LSTM(50, input_shape=(X.shape[1], X.shape[2])))
    model.add(Dense(1))
    model.compile(loss='mae', optimizer='adam', metrics=['mae'])
    # fit network
    model.fit(X_train, y_train, epochs=2, batch_size=10,
              validation_data=(X_test, y_test), shuffle=False)
I got the following loss:
    297071/297071 [==============================] - 196s 661us/step - loss: 122.8202 - mean_absolute_error: 122.8202 - val_loss: 78.2440 - val_mean_absolute_error: 78.2440
    Epoch 2/2
    297071/297071 [==============================] - 196s 661us/step - loss: 122.3811 - mean_absolute_error: 122.3811 - val_loss: 78.4328 - val_mean_absolute_error: 78.4328
And the following predictions (actual value, then prediction):
    -55.0   -45.222805
    -105.0  -21.363165
    29.0    -18.858946
    -125.0  -34.27912
    -134.0  20.847342
    -108.0  30.286516
    113.0   31.09069
    -63.0   8.848535
Is the problem the architecture or the data?
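One data-side thing I have not ruled out is scaling: in the sample rows some columns are around 0.0001 while others, including the target, run into the hundreds. Below is a minimal sketch of standardizing inputs and target with statistics taken from the training split only. The helper names (`fit_scaler`, `scale`, `unscale_y`) are hypothetical, the arrays are random stand-ins with the same layout as my data, and I have not confirmed that this fixes the problem.

```python
import numpy as np

def fit_scaler(X_train, y_train):
    """Compute per-feature input statistics and target statistics."""
    # Average over samples and time steps, keeping one value per feature.
    x_mean = X_train.mean(axis=(0, 1), keepdims=True)
    x_std = X_train.std(axis=(0, 1), keepdims=True) + 1e-8  # avoid division by zero
    y_mean = y_train.mean()
    y_std = y_train.std() + 1e-8
    return x_mean, x_std, y_mean, y_std

def scale(X, y, stats):
    """Standardize inputs and targets with previously fitted statistics."""
    x_mean, x_std, y_mean, y_std = stats
    return (X - x_mean) / x_std, (y - y_mean) / y_std

def unscale_y(y_scaled, stats):
    """Map scaled targets or predictions back to the original units."""
    _, _, y_mean, y_std = stats
    return y_scaled * y_std + y_mean

# Random stand-in data shaped (samples, window_size, nb_series).
rng = np.random.default_rng(0)
X_toy = rng.normal(loc=50.0, scale=20.0, size=(100, 10, 24))
y_toy = rng.normal(loc=0.0, scale=120.0, size=100)

stats = fit_scaler(X_toy, y_toy)
X_s, y_s = scale(X_toy, y_toy, stats)
print(X_s.mean(), X_s.std(), y_s.mean(), y_s.std())
```

With this approach the same `stats` would be applied to `X_test` and `y_test`, and predictions would be passed through `unscale_y` before comparing them with the actual values.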