I would like to apply a one-year linear extrapolation inside a pipe. What I want to do is very similar to this simple example without grouping, but inside a pipe and using dplyr::group_by(). There are some examples like this one, this one or this one, but I cannot get the desired output.
Reproducible example:
library(dplyr)
test.frame <- data.frame(Country =
                           rep(c("Austria", "Brazil", "Canada"), each = 3, times = 3),
                         Entity = rep(c("CO2","CH4","N2O"), times = 9),
                         Year = rep(c(1990:1992), each = 9),
                         value = runif(27, 1,5))
test.frame2 <- data.frame(Country =
                            rep(c("Austria", "Brazil", "Canada"), each = 3),
                          Entity = rep(c("CO2","CH4","N2O"), times = 3),
                          Year = rep(c(1993), each = 3),
                          value = 0)
results_frame <- test.frame %>%
  dplyr::bind_rows(test.frame2)
I have two grouping categories ("Country" and "Entity"), and I want to use the values from 1990 to 1992 to fill in the 1993 values by linear extrapolation. Following the linked example, I can estimate a linear model:
linear_model <- test.frame %>%
  dplyr::group_by(Country, Entity) %>%
  lm(value ~ Year, data=.)
results <- predict.lm(linear_model, test.frame2)
However, results does not show the desired output. So, following the solution proposed here, I tried the following:
results_frame <- test.frame %>%
  dplyr::group_by(Country, Entity) %>%
  do(lm( value ~ Year , data = test.frame)) %>%
  predict.lm(linear_model, test.frame2) %>%
  bind_rows(test.frame)
But this does not work either.
Any help would be much appreciated!
Answer 0 (score: 2)
You can do the following using nested data frames. This solution is more general: test.frame2 does not need to be recreated after the prediction, and there can be multiple independent variables:
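library(tidyverse)
test.frame %>%
  group_by(Country, Entity) %>%
  nest() %>%                      # one row per group, observations in a list column
  inner_join(test.frame2 %>% select(-value) %>% group_by(Country, Entity) %>% nest(),
             by = c("Country", "Entity")) %>%
  # data.x holds the 1990-1992 rows, data.y the rows to predict (Year only)
  mutate(model = data.x %>% map(~lm(value ~ Year, data=.)),
         value = map2(model, data.y, predict)) %>%
  select(-data.x, -model) %>%
  unnest(cols = c(data.y, value)) %>%   # recent tidyr versions expect cols to be named
  bind_rows(test.frame, .)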
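For context on why the first attempt in the question gives unexpected numbers: lm() is not aware of dplyr groups, so piping a grouped data frame into lm(value ~ Year, data = .) fits a single pooled model over all rows. The sketch below illustrates this in base R with a small made-up data set (toy and newdata are hypothetical names, not objects from the question); it compares the pooled prediction with per-group predictions.

```r
# Made-up data (assumption for illustration): two series with opposite trends.
toy <- data.frame(
  Entity = rep(c("CO2", "CH4"), each = 3),
  Year   = rep(1990:1992, times = 2),
  value  = c(1, 2, 3,    # CO2 rises by 1 per year
             9, 7, 5),   # CH4 falls by 2 per year
  stringsAsFactors = FALSE
)
newdata <- data.frame(Entity = c("CO2", "CH4"), Year = 1993,
                      stringsAsFactors = FALSE)

# Pooled fit: Entity is not in the formula, so every 1993 row
# receives the same prediction (3.5 for this toy data).
pooled      <- lm(value ~ Year, data = toy)
pooled_pred <- unname(predict(pooled, newdata))

# Per-group fit: subset first, then fit and predict for that subset only.
per_group_pred <- sapply(newdata$Entity, function(e) {
  fit <- lm(value ~ Year, data = toy[toy$Entity == e, ])
  unname(predict(fit, data.frame(Year = 1993)))
})

print(pooled_pred)
print(per_group_pred)
```

The nested approach above avoids the pooling problem because each model only ever sees the rows stored in its own data.x cell.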
Answer 1 (score: 0)
You have to be careful to use the right data when fitting and when predicting:
library(dplyr)
set.seed(42)
test.frame <- data.frame(Country = rep(c("Austria", "Brazil", "Canada"), each = 3, times = 3),
                         Entity = rep(c("CO2","CH4","N2O"), times = 9),
                         Year = rep(c(1990:1992), each = 9),
                         value = runif(27, 1,5))
test.frame %>%
  group_by(Country, Entity) %>%
  # inside do(), . is the data of the current group only
  do(lm( value ~ Year , data = .) %>%
       predict(., data.frame(Year = 1993)) %>%
       tibble(Year = 1993, value = .)) %>%   # tibble() replaces the deprecated data_frame()
  bind_rows(test.frame)
#> # A tibble: 36 x 4
#> # Groups: Country, Entity [9]
#> Country Entity Year value
#> <fct> <fct> <dbl> <dbl>
#> 1 Austria CH4 1993 2.10
#> 2 Austria CO2 1993 2.03
#> 3 Austria N2O 1993 6.02
#> 4 Brazil CH4 1993 4.90
#> 5 Brazil CO2 1993 0.771
#> 6 Brazil N2O 1993 5.28
#> 7 Canada CH4 1993 4.69
#> 8 Canada CO2 1993 0.729
#> 9 Canada N2O 1993 1.49
#> 10 Austria CO2 1990 4.66
#> # ... with 26 more rows
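The same fit-per-group, predict-for-the-next-year idea can also be sketched without dplyr, which may help when checking results by hand. The example below is a base R illustration under stated assumptions: the toy data and the helper extrapolate_group() are hypothetical and not part of the answers above, and each series is exactly linear so the 1993 values are known in advance.

```r
# Made-up data (assumption): every Country/Entity series is exactly linear in Year.
toy <- expand.grid(Country = c("Austria", "Brazil"),
                   Entity  = c("CO2", "CH4"),
                   Year    = 1990:1992,
                   stringsAsFactors = FALSE)
slope <- ifelse(toy$Entity == "CO2", 2, -1)
start <- ifelse(toy$Country == "Austria", 10, 20)
toy$value <- start + slope * (toy$Year - 1990)

# Hypothetical helper: fit on one group's rows and predict a single new year.
extrapolate_group <- function(d, year = 1993) {
  fit <- lm(value ~ Year, data = d)
  data.frame(Country = d$Country[1],
             Entity  = d$Entity[1],
             Year    = year,
             value   = unname(predict(fit, newdata = data.frame(Year = year))),
             stringsAsFactors = FALSE)
}

# Split into Country/Entity groups, extrapolate each, and stack the results.
groups    <- split(toy, list(toy$Country, toy$Entity), drop = TRUE)
predicted <- do.call(rbind, lapply(groups, extrapolate_group))
rownames(predicted) <- NULL

# Predicted 1993 rows first, followed by the original observations.
result <- rbind(predicted, toy)
print(predicted)
```

Because the toy series are exactly linear, the predictions can be verified by hand: Austria/CO2 starts at 10 and rises by 2 per year, so its 1993 value should be 16.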