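I want to write a function that regresses a dependent variable (y) on each independent variable (x1, x2, etc.) separately, not as a single multiple regression. I also want a second function that computes the AIC value, so that both functions use the same formula. Does anyone know how to do this? I have a large dataset and need to fit a regression of a single dependent variable on each of many independent variables. I would appreciate any guidance.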
Answer 0 (score: 0)
The following code regresses the dependent variable (y) on each independent variable, one at a time.
data(mtcars)
x <- names(mtcars)[-1]    # every column except the response, mpg
out <- paste("mpg ~", x)  # one single-predictor formula string per column
out
#> [1] "mpg ~ cyl" "mpg ~ disp" "mpg ~ hp" "mpg ~ drat" "mpg ~ wt"
#> [6] "mpg ~ qsec" "mpg ~ vs" "mpg ~ am" "mpg ~ gear" "mpg ~ carb"
library(broom)
library(dplyr)
# Regression coefficients for each single-predictor model
tmp1 <- bind_rows(lapply(out, function(frml) {
  a <- tidy(lm(as.formula(frml), data = mtcars))
  a$frml <- frml  # record which formula produced these rows
  a
}))
head(tmp1)
#> # A tibble: 6 x 6
#> term estimate std.error statistic p.value frml
#> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 (Intercept) 37.9 2.07 18.3 8.37e-18 mpg ~ cyl
#> 2 cyl -2.88 0.322 -8.92 6.11e-10 mpg ~ cyl
#> 3 (Intercept) 29.6 1.23 24.1 3.58e-21 mpg ~ disp
#> 4 disp -0.0412 0.00471 -8.75 9.38e-10 mpg ~ disp
#> 5 (Intercept) 30.1 1.63 18.4 6.64e-18 mpg ~ hp
#> 6 hp -0.0682 0.0101 -6.74 1.79e- 7 mpg ~ hp
# Model-level statistics for each model (R-squared, AIC, BIC, ...)
tmp2 <- bind_rows(lapply(out, function(frml) {
  a <- glance(lm(as.formula(frml), data = mtcars))
  a$frml <- frml  # record which formula produced this row
  a
}))
head(tmp2)
#> # A tibble: 6 x 12
#> r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
#> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
#> 1 0.726 0.717 3.21 79.6 6.11e-10 2 -81.7 169. 174.
#> 2 0.718 0.709 3.25 76.5 9.38e-10 2 -82.1 170. 175.
#> 3 0.602 0.589 3.86 45.5 1.79e- 7 2 -87.6 181. 186.
#> 4 0.464 0.446 4.49 26.0 1.78e- 5 2 -92.4 191. 195.
#> 5 0.753 0.745 3.05 91.4 1.29e-10 2 -80.0 166. 170.
#> 6 0.175 0.148 5.56 6.38 1.71e- 2 2 -99.3 205. 209.
#> # ... with 3 more variables: deviance <dbl>, df.residual <int>, frml <chr>
write.csv(tmp1, "Try_lm_coefficients.csv")
write.csv(tmp2, "Try_lm_results.csv")
Created on 2019-11-11 by the reprex package (v0.3.0)
The results are saved in the files "Try_lm_coefficients.csv" and "Try_lm_results.csv".
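Since the question asks for a function, the steps above could also be wrapped into one, so that the regression and the AIC come from the same fitted formula. The following is only a sketch in base R, not part of the original answer: the name `fit_each_predictor` and its arguments are hypothetical, and it assumes every predictor is a numeric column with a syntactic name, so that each model has exactly one slope.

```r
# Hypothetical helper (an assumption, not from the original answer):
# regress `response` on each remaining column of `data`, one at a time.
fit_each_predictor <- function(data, response) {
  predictors <- setdiff(names(data), response)
  rows <- lapply(predictors, function(p) {
    frml <- reformulate(p, response = response)  # e.g. mpg ~ cyl
    fit  <- lm(frml, data = data)
    data.frame(
      formula   = paste(response, "~", p),
      slope     = unname(coef(fit)[2]),   # assumes a single numeric predictor
      r.squared = summary(fit)$r.squared,
      AIC       = AIC(fit),               # computed from the same fitted model
      stringsAsFactors = FALSE
    )
  })
  result <- do.call(rbind, rows)
  result[order(result$AIC), ]  # lowest AIC first
}

res <- fit_each_predictor(mtcars, "mpg")
head(res, 3)
```

Sorting by AIC puts the best-fitting single-predictor model first. Because every model here has the same number of parameters and the same observations, this ordering matches the ordering by R-squared.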