I am using Keras in R to predict a financial time series. I want to train an MLP with 2 hidden layers of 40 neurons each to predict stock prices. The target data consists of the stock price, and the training data consists of the four lagged prices. The object
input_data
looks like this:
price_lag_4 price_lag_3 price_lag_2 price_lag_1 price
2018-04-13 157.73 161.21 160.28 162.21 161.37
2018-04-16 161.21 160.28 162.21 161.37 162.60
2018-04-17 160.28 162.21 161.37 162.60 166.10
2018-04-18 162.21 161.37 162.60 166.10 166.44
2018-04-19 161.37 162.60 166.10 166.44 164.91
2018-04-20 162.60 166.10 166.44 164.91 162.30
In addition, I split the data into training and target sets:
train_data = input_data["2014::2017",1:4]
train_targets = input_data["2014::2017",5]
and normalized it with min-max scaling:
train_data = as.matrix(train_data)
train_targets = as.matrix(train_targets)
train_data = (train_data - min(train_data)) / (max(train_data) - min(train_data))
train_targets = (train_targets - min(train_targets)) / (max(train_targets) - min(train_targets))
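The min-max transform above maps everything into [0, 1], which also means any predictions come out on that normalized scale and have to be mapped back before comparing them to real prices. A minimal language-neutral sketch of the transform and its inverse (the Python helper names are my own, not from the post):

```python
def minmax_scale(xs):
    # x' = (x - min) / (max - min), as in the R snippet above
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs], lo, hi

def minmax_invert(scaled, lo, hi):
    # Undo the scaling: x = x' * (max - min) + min
    return [s * (hi - lo) + lo for s in scaled]

prices = [157.73, 161.21, 162.21, 166.44]
scaled, lo, hi = minmax_scale(prices)
print(round(min(scaled), 2), round(max(scaled), 2))  # 0.0 1.0 by construction
restored = minmax_invert(scaled, lo, hi)
print([round(p, 2) for p in restored])  # recovers the original prices
```

Note that at prediction time the same lo/hi computed on the training set have to be reused, otherwise the model sees inputs on a different scale than it was trained on.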
Then I built an MLP with 4 neurons in the input layer, 2 hidden layers of 40 neurons each, and 1 neuron in the output layer, and fitted it:
validation_split = 0.05
model = keras_model_sequential() %>%
  layer_dense(units = 40, activation = "relu", input_shape = dim(train_data)[2]) %>%
  layer_dense(units = 40, activation = "relu") %>%
  layer_dense(units = 1, activation = "relu")
model %>% compile(optimizer = optimizer_sgd(), loss = "mse", metrics = c("mae"))
model %>% fit(x = train_data, y = train_targets, epochs = 60, batch_size = 32,
  validation_split = validation_split)
The fit converged:
Trained on 956 samples, validated on 51 samples (batch_size=32, epochs=60)
Final epoch (plot to see history):
val_loss: 0.0004162
val_mean_absolute_error: 0.0159
loss: 0.0002706
mean_absolute_error: 0.01215
For further prediction I use the 2018 prices:
validation_data = input_data["2018",1:4]
tail(validation_data)
price_lag_4 price_lag_3 price_lag_2 price_lag_1
2018-04-13 157.73 161.21 160.28 162.21
2018-04-16 161.21 160.28 162.21 161.37
2018-04-17 160.28 162.21 161.37 162.60
2018-04-18 162.21 161.37 162.60 166.10
2018-04-19 161.37 162.60 166.10 166.44
2018-04-20 162.60 166.10 166.44 164.91
prediction_sgd = predict(object = model, x = validation_data)
tail(prediction_sgd)
[,1]
[71,] 147.2574
[72,] 148.6506
[73,] 148.6407
[74,] 149.8874
[75,] 150.8464
[76,] 151.8221
The predictions are somewhat close to the actual prices:
validation_targets = prices["2018"]
tail(validation_targets)
[,1]
2018-04-13 161.37
2018-04-16 162.60
2018-04-17 166.10
2018-04-18 166.44
2018-04-19 164.91
2018-04-20 162.30
So this MLP architecture works in some fashion, but when I change the activation function to tanh, the model becomes:
validation_split = 0.05
model = keras_model_sequential() %>%
  layer_dense(units = 40, activation = "tanh", input_shape = dim(train_data)[2]) %>%
  layer_dense(units = 40, activation = "tanh") %>%
  layer_dense(units = 1)
model %>% compile(optimizer = optimizer_sgd(), loss = "mse", metrics = c("mae"))
history = model %>% fit(x = train_data, y = train_targets, epochs = 60,
  batch_size = 32, validation_split = validation_split)
Trained on 956 samples, validated on 51 samples (batch_size=32, epochs=60)
Final epoch (plot to see history):
val_loss: 0.0306
val_mean_absolute_error: 0.1728
loss: 0.001923
mean_absolute_error: 0.0343
and I get strange predictions:
prediction_sgd = predict(object = model, x = validation_data)
tail(prediction_sgd)
[,1]
[71,] 0.9751762
[72,] 0.9749264
[73,] 0.9750333
[74,] 0.9750219
[75,] 0.9747972
[76,] 0.9749493
I also get strange predictions when I use the sigmoid activation function.
So I have the following questions:
1) Why are the predictions so strange in the second case? Am I doing something wrong?
2) Do I need to normalize the target data, i.e. the y input to the fit function?
Answer (score: 0):
The problem is that tanh and sigmoid have output ranges of [-1, 1] and [0, 1], respectively. So if y lies outside that range, the network cannot learn, because it cannot produce those values: it will try to predict as high as the network can go, which is just 1.
Therefore you need a final activation that can produce values in the required output range, for example a linear activation or ReLU. This does not apply to the intermediate layers.
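The range argument can be checked numerically, independent of Keras (a small Python sketch, not code from the post):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# tanh is bounded in (-1, 1) and sigmoid in (0, 1): however large the
# pre-activation gets, the output saturates and can never reach a raw
# price level like 160.
for x in [1.0, 5.0, 10.0]:
    assert -1.0 < math.tanh(x) < 1.0
    assert 0.0 < sigmoid(x) < 1.0

print(round(math.tanh(10.0), 4))  # 1.0: effectively saturated
```

This is also why the tanh/sigmoid models above all emit nearly identical values around 1: the saturated output is the closest the network can get to the targets.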