I am trying to see the power of recurrent neural computations.
I give the NN only one feature, the value of the time series one step in the past, and ask it to predict the current value.
However, the time series is double-seasonal, with a rather long ACF structure (about 64 lags) as well as shorter seasonal lags.
You can notice that the forecast is shifted relative to the actual series. I checked my input vectors and they seem fine.
The MSE residuals are also quite bad (since Gaussian noise with sigma = 0.1 was added, I would expect roughly 0.01, i.e. the noise variance sigma^2, on both the training and validation sets):
> head(x_train)
[1] 0.9172955 0.9285578 0.4046166 -0.4144658 -0.3121450 0.3958689
> head(y_train)
[,1]
[1,] 0.9285578
[2,] 0.4046166
[3,] -0.4144658
[4,] -0.3121450
[5,] 0.3958689
[6,] 1.5823631
Question: am I doing something wrong with the LSTM architecture, or is my code wrong in the way it samples the data?
The code below assumes you have all of the listed libraries installed.
library(keras)
library(data.table)
library(ggplot2)
# ggplot common theme -------------------------------------------------------------
ggplot_theme <- theme(
text = element_text(size = 16) # general text size
, axis.text = element_text(size = 16) # changes axis labels
, axis.title = element_text(size = 18) # change axis titles
, plot.title = element_text(size = 20) # change title size
, axis.text.x = element_text(angle = 90, hjust = 1)
, legend.text = element_text(size = 16)
, strip.text = element_text(face = "bold", size = 14, color = "grey17")
, panel.background = element_blank() # remove background of chart
, panel.grid.minor = element_blank() # remove minor grid marks
)
# constants
features <- 1
timesteps <- 1
x_diff <- sin(seq(0.1, 100, 0.1)) + sin(seq(1, 1000, 1)) + rnorm(1000, 0, 0.1)
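# x_diff: two superimposed sinusoids (periods of roughly 63 and 6 samples) plus Gaussian noise with sd = 0.1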
#x_diff <- ((x_diff - min(x_diff)) / (max(x_diff) - min(x_diff)) - 0.5) * 2
# generate training data
train_list <- list()
train_y_list <- list()
for(
i in 1:(length(x_diff) / 2 - timesteps)
)
{
train_list[[i]] <- x_diff[i:(timesteps + i - 1)]
train_y_list[[i]] <- x_diff[timesteps + i]
}
x_train <- unlist(train_list)
y_train <- unlist(train_y_list)
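# reshape inputs into the 3D array (samples, timesteps, features) expected by layer_lstm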
x_train <- array(x_train, dim = c(length(train_list), timesteps, features))
y_train <- matrix(y_train, ncol = 1)
# generate validation data
val_list <- list()
val_y_list <- list()
for(
i in (length(x_diff) / 2):(length(x_diff) - timesteps)
)
{
val_list[[i - length(x_diff) / 2 + 1]] <- x_diff[i:(timesteps + i - 1)]
val_y_list[[i - length(x_diff) / 2 + 1]] <- x_diff[timesteps + i]
}
x_val <- unlist(val_list)
y_val <- unlist(val_y_list)
x_val <- array(x_val, dim = c(length(val_list), timesteps, features))
y_val <- matrix(y_val, ncol = 1)
## lstm (stacked) ----------------------------------------------------------
# define and compile model
# expected input data shape: (batch_size, timesteps, features)
fx_model <-
keras_model_sequential() %>%
layer_lstm(
units = 32
#, return_sequences = TRUE
, input_shape = c(timesteps, features)
) %>%
#layer_lstm(units = 16, return_sequences = TRUE) %>%
#layer_lstm(units = 16) %>% # return a single vector dimension 16
#layer_dropout(rate = 0.5) %>%
layer_dense(units = 4, activation = 'tanh') %>%
layer_dense(units = 1, activation = 'linear') %>%
compile(
loss = 'mse',
optimizer = 'RMSprop',
metrics = c('mse')
)
# train
# early_stopping <-
# callback_early_stopping(
# monitor = 'val_loss'
# , patience = 10
# )
history <-
fx_model %>%
fit(
x_train, y_train, batch_size = 50, epochs = 100, validation_data = list(x_val, y_val)
)
plot(history)
## plot predict
fx_predict <- data.table(
forecast = as.numeric(predict(
fx_model
, x_val
))
, fact = as.numeric(y_val[, 1])
, timestep = 1:length(x_diff[(length(x_diff) / 2):(length(x_diff) - timesteps)])
)
fx_predict_melt <- melt(fx_predict
, id.vars = 'timestep'
, measure.vars = c('fact', 'forecast')
)
ggplot(
fx_predict_melt[timestep < 301, ]
, aes(x = timestep
, y = value
, group = variable
, color = variable)
) +
geom_line(
alpha = 0.95
, size = 1
) +
ggplot_theme
Answer 0 (score: 2)
It is always hard to tell what is wrong just by looking at the code, but here are a few things you can try.
Edit: the window of input data (the lookback) is another parameter that needs tuning. With a lookback of only 1 (start from at least 2), the network cannot easily find patterns unless they are overly simple. The more complex the patterns, the larger the window you will want to feed in, up to a point; see the sketch below.
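As an illustration (not part of the original answer), here is a minimal sketch of how the sampling loop and the model from the question could be adapted to a longer lookback window; the window length of 12 is an arbitrary assumption and would itself need tuning:
timesteps <- 12   # lookback window (arbitrary choice, tune as needed)
features  <- 1
train_list <- list()
train_y_list <- list()
for (i in 1:(length(x_diff) / 2 - timesteps)) {
  train_list[[i]]   <- x_diff[i:(i + timesteps - 1)]   # the last 12 observed values
  train_y_list[[i]] <- x_diff[i + timesteps]           # the value to predict
}
# fill the 3D array sample by sample so each row keeps its own contiguous window
x_train <- array(0, dim = c(length(train_list), timesteps, features))
for (i in seq_along(train_list)) {
  x_train[i, , 1] <- train_list[[i]]
}
y_train <- matrix(unlist(train_y_list), ncol = 1)
# only input_shape needs to change in the model definition
fx_model <-
  keras_model_sequential() %>%
  layer_lstm(units = 32, input_shape = c(timesteps, features)) %>%
  layer_dense(units = 1, activation = 'linear') %>%
  compile(loss = 'mse', optimizer = 'rmsprop')
Note that with a lookback greater than 1, filling the array sample by sample (rather than with a single unlist/array call) keeps each window intact, because R fills arrays column-major.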
Answer 1 (score: 0)
It looks to me very similar to the question posted here: stock prediction : GRU model predicting same given values instead of future stock price
As stated in the answer to that question, I believe you will start to see the limitations of the model if you try to predict the difference between consecutive sample values instead of predicting the sample value directly. When predicting the sample value directly, the model easily figures out that simply using the previous value as the prediction already minimizes the MSE quite well, and that is why you get a result that lags by one step. A sketch of what training on differences could look like is given below.
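As an illustration (not part of the original answer), here is a minimal sketch of training on first differences, reusing the x_diff series and the 1-step lookback from the question; the forecast of the level is reconstructed by adding the predicted difference back onto the last observed value:
# first differences of the series from the question: d[t] = x_diff[t + 1] - x_diff[t]
d <- diff(x_diff)
# same 1-step sampling as in the question, but on the differenced series
n_train <- floor(length(d) / 2)
x_train_d <- array(d[1:(n_train - 1)], dim = c(n_train - 1, 1, 1))
y_train_d <- matrix(d[2:n_train], ncol = 1)
diff_model <-
  keras_model_sequential() %>%
  layer_lstm(units = 32, input_shape = c(1, 1)) %>%
  layer_dense(units = 1, activation = 'linear') %>%
  compile(loss = 'mse', optimizer = 'rmsprop')
diff_model %>% fit(x_train_d, y_train_d, batch_size = 50, epochs = 100)
# reconstruct the level: last observed value + predicted difference
pred_diff  <- predict(diff_model, x_train_d)
pred_level <- x_diff[2:n_train] + as.numeric(pred_diff)
If the model collapses to predicting a difference near zero, the reconstructed series is again essentially a one-step-lagged copy of the input, which is exactly the limitation the answer describes.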