Gap between forecast data and actual data in ggplot

Time: 2019-02-25 04:28:58

Tags: r ggplot2 forecast

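I am trying to plot some data, fitted values and forecasts in a nice ggplot format, but when I plot the data the way I think it should be done, I get a gap between the actual data and the forecast. The gap has no meaning, but it would be nice if it went away.

Some R code you can use to reproduce my problem: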

library(xts)
library(tidyverse)
library(forecast)


dates <- seq(as.Date("2016-01-01"), length = 100, by = "days")

realdata <- arima.sim(model = list(ar = 0.7, order = c(1,1,0)), n = 99)

data <- xts(realdata, order.by = dates)

user_arima <- arima(data, order = c(1,1,0))

user_arimaf <- forecast(user_arima)

fits <- xts(user_arimaf$fitted, order.by = dates)

fcastdates <- as.Date(dates[100]) + 1:10

meancast <- xts(user_arimaf$mean[1:10], order.by = fcastdates)

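# $lower and $upper are matrices with one column per level ("80%" and "95%" by default).
# Select the 95% column by name: plain [1:10] would return the 80% bounds instead.
lowercast95 <- xts(as.numeric(user_arimaf$lower[, "95%"]), order.by = fcastdates)
uppercast95 <- xts(as.numeric(user_arimaf$upper[, "95%"]), order.by = fcastdates)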

frame <- merge(data, fits, meancast, uppercast95, lowercast95, all = TRUE, fill = NA)


frame <- as.data.frame(frame) %>% 
  mutate(date = as.Date(dates[1] + 0:(109)))

frame %>% 
  ggplot() +
  geom_line(aes(date, data, color = "Data")) +
  geom_line(aes(date, fits, color = "Fitted")) +
  geom_line(aes(date, meancast, color = "Forecast")) +
  geom_ribbon(aes(date, ymin=lowercast95,ymax=uppercast95),alpha=.25) +
  scale_color_manual(values = c(
    'Data' = 'black',
    'Fitted' = 'red',
    'Forecast' = 'darkblue')) +
  labs(color = 'Legend') +
  theme_classic() +
  ylab("some data") +
  xlab("Date") +
  labs(title = "chart showing a gap",  
       subtitle = "Shaded area is the 95% CI from the ARIMA")

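The resulting chart is below.

[Image: the chart produced by the code above, with a gap between the last actual data point and the start of the forecast]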

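I know there is now a geom_forecast for ggplot, but I would like to build this particular plot my own way. If there is no other solution, though, I will use geom_forecast.

1 answer:

Answer 0: (score: 2)

To close the gap, you need to supply a data point in the meancast column for the empty region. I think it makes sense to simply use the value of the last "real" data point, so that the forecast line and ribbon start exactly where the data line ends.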

# Grab the y-value corresponding to the date just before the gap.
last_data_value = frame[frame$date == as.Date("2016-04-09"), "data"]

# Construct a one-row data.frame.
extra_row = data.frame(data=NA_real_, 
                       fits=NA_real_, 
                       meancast=last_data_value,
                       uppercast95=last_data_value,
                       lowercast95=last_data_value,
                       date=as.Date("2016-04-09"))

# Add extra row to the main data.frame.
frame = rbind(frame, extra_row)
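
The snippet above hard-codes the date 2016-04-09, which only matches this particular example. Below is a minimal sketch of the same idea that looks up the last observed row instead. The helper name close_forecast_gap and the toy data frame are hypothetical and only for illustration; the column names are assumed to match those of frame in the question.

```r
# Hypothetical helper (not from the original answer): append a bridging row
# at the last date that has an observed value, without hard-coding the date.
close_forecast_gap <- function(frame) {
  # Index of the last row with a non-missing actual observation.
  last_obs <- max(which(!is.na(frame$data)))
  last_value <- frame$data[last_obs]

  # Same one-row data.frame as in the answer, built from the looked-up row.
  extra_row <- data.frame(
    data        = NA_real_,
    fits        = NA_real_,
    meancast    = last_value,
    uppercast95 = last_value,
    lowercast95 = last_value,
    date        = frame$date[last_obs]
  )
  rbind(frame, extra_row)
}

# Toy frame with the same column names as in the question:
# three observed days followed by two forecast days.
toy <- data.frame(
  data        = c(1.0, 2.0, 3.0, NA, NA),
  fits        = c(1.1, 1.9, 3.2, NA, NA),
  meancast    = c(NA, NA, NA, 3.5, 4.0),
  uppercast95 = c(NA, NA, NA, 4.5, 5.5),
  lowercast95 = c(NA, NA, NA, 2.5, 2.5),
  date        = as.Date("2016-01-01") + 0:4
)

bridged <- close_forecast_gap(toy)
```

With the real data this would be called as frame <- close_forecast_gap(frame) before re-running the ggplot code. The appended row ends up at the bottom of the data frame, which should not matter for the plot because geom_line and geom_ribbon order their points by x before drawing.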

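
To see where the gap comes from: each geom_line layer only connects points that are non-missing in its own column, so the "Data" line stops at the last observed date and the "Forecast" line starts one day later, with nothing drawn in between. A small sketch of a check for this follows. The helper name gap_in_days and the toy values are hypothetical; only the column names data, meancast and date are taken from the question.

```r
# Hypothetical helper: number of days between the last observed value
# and the first forecast value.
gap_in_days <- function(frame) {
  last_observed  <- max(frame$date[!is.na(frame$data)])
  first_forecast <- min(frame$date[!is.na(frame$meancast)])
  as.numeric(first_forecast - last_observed)
}

# Toy frame: three observed days followed by two forecast days.
toy <- data.frame(
  data     = c(1, 2, 3, NA, NA),
  meancast = c(NA, NA, NA, 3.5, 4),
  date     = as.Date("2016-01-01") + 0:4
)

# Before the fix the two layers are one day apart.
gap_before <- gap_in_days(toy)

# Bridge the gap as in the answer: give the forecast column a value
# at the last observed date.
bridge <- data.frame(data = NA_real_, meancast = 3,
                     date = as.Date("2016-01-03"))
gap_after <- gap_in_days(rbind(toy, bridge))
```

A result of 0 after the fix means the forecast line now starts on the same date as the last observation, so the two lines meet.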