Linear regression of a dependent variable on multiple independent variables, one at a time

Time: 2019-11-11 09:38:55

Tags: r linear-regression

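I want to create a function that regresses a dependent variable (y) on each independent variable (x1, x2, etc.) separately, rather than as a single multiple regression. I would also like a second function that computes the AIC value, with both functions using the same formula. Does anyone know how to do this? I have a very large dataset, and I need a separate regression of one dependent variable on each of many independent variables. Any guidance would be much appreciated.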

1 answer:

Answer 0 (score: 0)

The following code regresses the dependent variable (here mpg from mtcars) on each independent variable separately:

data(mtcars)

# Names of all candidate predictors (every column except the response, mpg)
x <- names(mtcars)[-1]
# Build one single-predictor formula string per independent variable
out <- paste0("mpg ~ ", x)
out
#>  [1] "mpg ~ cyl"  "mpg ~ disp" "mpg ~ hp"   "mpg ~ drat" "mpg ~ wt"  
#>  [6] "mpg ~ qsec" "mpg ~ vs"   "mpg ~ am"   "mpg ~ gear" "mpg ~ carb"
library(broom)
library(dplyr)

# Regression coefficients: fit one lm per formula and stack the tidy() tables
tmp1 = bind_rows(lapply(out, function(frml) {
  a = tidy(lm(frml, data=mtcars))
  a$frml = frml
  return(a)
}))
head(tmp1)
#> # A tibble: 6 x 6
#>   term        estimate std.error statistic  p.value frml      
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl> <chr>     
#> 1 (Intercept)  37.9      2.07        18.3  8.37e-18 mpg ~ cyl 
#> 2 cyl          -2.88     0.322       -8.92 6.11e-10 mpg ~ cyl 
#> 3 (Intercept)  29.6      1.23        24.1  3.58e-21 mpg ~ disp
#> 4 disp         -0.0412   0.00471     -8.75 9.38e-10 mpg ~ disp
#> 5 (Intercept)  30.1      1.63        18.4  6.64e-18 mpg ~ hp  
#> 6 hp           -0.0682   0.0101      -6.74 1.79e- 7 mpg ~ hp

# Model-level results (R2, AIC, BIC): one glance() row per formula
tmp2 = bind_rows(lapply(out, function(frml) {
  a = glance(lm(frml, data=mtcars))
  a$frml = frml
  return(a)
}))
head(tmp2)
#> # A tibble: 6 x 12
#>   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
#>       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <int>  <dbl> <dbl> <dbl>
#> 1     0.726         0.717  3.21     79.6  6.11e-10     2  -81.7  169.  174.
#> 2     0.718         0.709  3.25     76.5  9.38e-10     2  -82.1  170.  175.
#> 3     0.602         0.589  3.86     45.5  1.79e- 7     2  -87.6  181.  186.
#> 4     0.464         0.446  4.49     26.0  1.78e- 5     2  -92.4  191.  195.
#> 5     0.753         0.745  3.05     91.4  1.29e-10     2  -80.0  166.  170.
#> 6     0.175         0.148  5.56      6.38 1.71e- 2     2  -99.3  205.  209.
#> # ... with 3 more variables: deviance <dbl>, df.residual <int>, frml <chr>

write.csv(tmp1, "Try_lm_coefficients.csv")
write.csv(tmp2, "Try_lm_results.csv")

Created on 2019-11-11 by the reprex package (v0.3.0)

The results are written to the files "Try_lm_coefficients.csv" and "Try_lm_results.csv".
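
Since the question asks for a single function, here is a minimal base-R sketch of how the steps above could be wrapped into one. The name `fit_each` and its arguments are hypothetical and not part of the answer above. It assumes every predictor is numeric, so that each model has exactly one slope, and it uses `reformulate()` and `AIC()` from the stats package instead of broom.

```r
# Hypothetical helper (not from the original answer): fit response ~ predictor
# separately for each predictor and collect slope, R-squared and AIC.
# Assumes numeric predictors, so coef(fit)[2] is the single slope.
fit_each <- function(data, response,
                     predictors = setdiff(names(data), response)) {
  rows <- lapply(predictors, function(p) {
    # reformulate("cyl", response = "mpg") builds the formula mpg ~ cyl
    fit <- lm(reformulate(p, response = response), data = data)
    data.frame(
      predictor = p,
      slope     = unname(coef(fit)[2]),
      r.squared = summary(fit)$r.squared,
      AIC       = AIC(fit),
      stringsAsFactors = FALSE
    )
  })
  res <- do.call(rbind, rows)
  res[order(res$AIC), ]  # sort so the lowest AIC comes first
}

res <- fit_each(mtcars, "mpg")
head(res)
```

Because the coefficients and the AIC come from the same fitted model, both are guaranteed to use the same formula. For a very large dataset, the `predictors` argument can be restricted to a subset of columns rather than looping over all of them.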