Multiple linear regression model and ensemble ggplot in R?

Time: 2020-06-19 22:09:15

Tags: r regression linear-regression tidyverse predict

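I am trying to predict Level for June to September 2020 using a multiple linear regression model. In the example below, I assume that the conditions of 2016 will be repeated and use them to predict the level for June to September 2020. I plot the observed level up to May 31 as a solid line, and the Forecasted Level as a dashed blue line.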

library(tidyverse)
library(lubridate)

set.seed(1500)

DF <- data.frame(Date = seq(as.Date("2000-01-01"), to = as.Date("2018-12-31"), by = "days"),
                 Level = runif(6940, 360, 366), Flow = runif(6940, 1,10),
                 PCP = runif(6940, 0,25), MeanT = runif(6940, 1, 30)) %>% 
                  mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>% 
                  filter(between(Month, 6, 9))
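# Fit the regression on all June to September days from 2000 to 2018
Model <- lm(data = DF, Level ~ Flow + PCP + MeanT)
# Predictors observed in 2016, used as the scenario for 2020
Yr_2016 <- DF %>%
  filter(Year == 2016) %>% 
  select(Flow, PCP, MeanT)
# Forecast for 1 June to 30 September 2020 (122 days, same length as Yr_2016)
Pred2020 <- data.frame(Date = seq(as.Date("2020-06-01"), to = as.Date("2020-09-30"), by = "days"),
                       Forecast = predict(Model, Yr_2016))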
  
Obs2020 <- data.frame(Date = seq(as.Date("2020-01-01"), to = as.Date("2020-05-31"), by = "days"),
                      Level = runif(152, 360, 366))

# Observed level as a solid black line, forecast as a dashed blue line
ggplot(data = Obs2020, aes(x = Date, y = Level)) +
  geom_line(size = 2, colour = "black") +
  geom_line(data = Pred2020, aes(x = Date, y = Forecast), linetype = "dashed", colour = "blue")


My goal

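I would like to use the fitted model to predict the level for June to September 2020, assuming that the conditions of every year in DF will be repeated (not only those of 2016), and then create a plot in which all the yearly scenarios (June to September) are shown in different colors.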

1 answer:

Answer 0 (score: 1)

New answer

The code below should do what you want (if I understood you correctly). However, the plot is still quite messy.

library(tidyverse)
library(lubridate)

set.seed(1500)

DF <- data.frame(Date = seq(as.Date("2000-01-01"), to = as.Date("2018-12-31"), by = "days"),
                 Level = runif(6940, 360, 366), Flow = runif(6940, 1,10),
                 PCP = runif(6940, 0,25), MeanT = runif(6940, 1, 30)) %>% 
  mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>% 
  filter(between(Month, 6, 9))

Model <- lm(data = DF, Level ~ Flow + PCP + MeanT)

Obs2020 <- data.frame(Date = seq(as.Date("2020-01-01"),
                                 to = as.Date("2020-05-31"),
                                 by = "days"),
                      Level = runif(152, 362.7, 363.25))
pred_data <- DF %>% 
  nest_by(Year) %>% 
  mutate(pred_df = list(tibble(Date = seq(as.Date("2020-06-01"),
                                          to = as.Date("2020-09-30"),
                                          by = "days"),
                               Forecast = predict(.env$Model, data)))) %>%
  select(Year, pred_df) %>% 
  unnest(pred_df) 

ggplot(data = Obs2020, aes(x = Date, y = Level)) +
  geom_line(size = 0.1, colour = "black") +
  geom_line(data = pred_data,
            aes(x = Date, y = Forecast, group = factor(Year), color = factor(Year)),
            size = 0.1)
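
One likely reason the yearly lines overlap so heavily: in this synthetic example Level is drawn independently of Flow, PCP and MeanT, so the fitted model should explain almost none of the variance and every scenario hovers near the overall mean. The base-R sketch below checks this. It assumes the same seed and distributions as above but skips the Date column and the June to September filter for brevity, so the numbers will not match the model above exactly; `r2`, `fitted_range` and `observed_range` are names made up for this illustration.

```r
set.seed(1500)

# Response and predictors drawn independently, mirroring the example data
n <- 6940
DF <- data.frame(Level = runif(n, 360, 366), Flow = runif(n, 1, 10),
                 PCP = runif(n, 0, 25), MeanT = runif(n, 1, 30))

Model <- lm(Level ~ Flow + PCP + MeanT, data = DF)

# Share of variance explained; expected to be close to 0 for independent draws
r2 <- summary(Model)$r.squared

# Spread of the fitted values compared with the spread of the observed response
fitted_range <- diff(range(fitted(Model)))
observed_range <- diff(range(DF$Level))

print(c(r_squared = r2, fitted_range = fitted_range, observed_range = observed_range))
```

With real data, where the predictors actually carry information about the level, the yearly scenarios would be expected to separate more clearly.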

Created on 2020-06-20 by the reprex package (v0.3.0)
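
If the individual yearly lines are not essential, one way to make the figure easier to read is to collapse the scenarios into a daily envelope (minimum, mean, maximum) and draw it as a ribbon. This is only a sketch and a different design from the code above: it predicts on all rows at once and shifts the dates onto 2020 with `lubridate::make_date()` instead of using `nest_by()`. The names `pred_all`, `pred_band`, `lo`, `mid` and `hi` are invented for this illustration.

```r
library(tidyverse)
library(lubridate)

set.seed(1500)

# Same synthetic data as above: daily values for 2000 to 2018, June to September only
DF <- data.frame(Date = seq(as.Date("2000-01-01"), to = as.Date("2018-12-31"), by = "days"),
                 Level = runif(6940, 360, 366), Flow = runif(6940, 1, 10),
                 PCP = runif(6940, 0, 25), MeanT = runif(6940, 1, 30)) %>%
  mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
  filter(between(Month, 6, 9))

Model <- lm(Level ~ Flow + PCP + MeanT, data = DF)

Obs2020 <- data.frame(Date = seq(as.Date("2020-01-01"), to = as.Date("2020-05-31"), by = "days"),
                      Level = runif(152, 362.7, 363.25))

# One forecast per historical day, with the date moved onto the year 2020
pred_all <- DF %>%
  mutate(Forecast = as.numeric(predict(Model, newdata = DF)),
         Date = make_date(2020, Month, Day)) %>%
  select(Year, Date, Forecast)

# Collapse the yearly scenarios into a daily envelope
pred_band <- pred_all %>%
  group_by(Date) %>%
  summarise(lo = min(Forecast), mid = mean(Forecast), hi = max(Forecast),
            .groups = "drop")

# Observations as a solid line, scenario range as a ribbon, scenario mean dashed
p <- ggplot() +
  geom_line(data = Obs2020, aes(x = Date, y = Level)) +
  geom_ribbon(data = pred_band, aes(x = Date, ymin = lo, ymax = hi),
              fill = "steelblue", alpha = 0.3) +
  geom_line(data = pred_band, aes(x = Date, y = mid),
            colour = "blue", linetype = "dashed")
p
```

`pred_all` keeps one row per year and day, so it can also be passed to `geom_line()` with `color = factor(Year)` when the individual scenarios are wanted after all.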
