R-使用循环或函数迭代变量名称

时间:2018-06-05 05:23:27

标签: r list function loops dataframe

我想使用for循环或R中的函数循环数据框中的变量。我编写了以下代码(它不起作用):

y <- c(0,0,1,1,0,1,0,1,1,1)
var1 <- c("a","a","a","b","b","b","c","c","c","c")
var2 <- c("m","m","n","n","n","n","o","o","o","m")

mydata <- data.frame(y,var1,var2)

myfunction <- function(v){
  regressionresult <- lm(y ~ v, data = mydata)
  summary(regressionresult) 
}
myfunction("var1")

当我尝试运行时,我收到错误消息:

Error in model.frame.default(formula = y ~ v, data = mydata, drop.unused.levels = TRUE) : variable lengths differ (found for 'v')

我不认为这是数据的问题,但是我如何引用变量名称,因为以下代码产生了所需的回归结果(对于我想要循环的一个变量):

regressionresult <- lm(y ~ var1, data = mydata) summary(regressionresult)

如何修复函数,或将变量名放在循环中?

[我也尝试循环变量名称,但遇到与函数类似的问题:

for(v in c("var1","var2")){
  regressionresult <- lm(y ~ v, data = mydata)
  summary(regressionresult)  
}

运行此循环时,会产生错误:

Error in model.frame.default(formula = y ~ v, data = mydata, drop.unused.levels = TRUE) : 
  variable lengths differ (found for 'v')

感谢您的帮助!

2 个答案:

答案 0 :(得分:0)

我们可以使用paste创建公式,将其传递到lm

myfunction <- function(v){
  regressionresult <- lm(paste0('y ~', v), data = mydata)
  summary(regressionresult) 
}
out1 <- myfunction("var1")

或使用glue::glue

myfunction <- function(v){
  regressionresult <- lm(glue::glue('y ~ {v}'), data = mydata)
  summary(regressionresult) 
 }
myfunction("var1")

答案 1 :(得分:0)

您可以使用tidyverse中的函数来处理整洁数据并将模型应用于不同的公式。

y <- c(0,0,1,1,0,1,0,1,1,1)
var1 <- c("a","a","a","b","b","b","c","c","c","c")
var2 <- c("m","m","n","n","n","n","o","o","o","m")

library(tidyverse)
mydata <- data_frame(y,var1,var2)

res <- mydata %>%
  # get data in long format - tidy format
  gather("var_type", "value", -y) %>%
  # we want one model per var_type
  nest(-var_type) %>%
  # apply lm on each data
  mutate(
    regressionresult = map(data, ~lm(y ~ value, data = .x))
  )
res
#> # A tibble: 2 x 3
#>   var_type data              regressionresult
#>   <chr>    <list>            <list>          
#> 1 var1     <tibble [10 x 2]> <S3: lm>        
#> 2 var2     <tibble [10 x 2]> <S3: lm>
summary(res$regressionresult[[1]])
#> 
#> Call:
#> lm(formula = y ~ value, data = .x)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -0.7500 -0.3333  0.2500  0.3125  0.6667 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)   0.3333     0.3150   1.058    0.325
#> valueb        0.3333     0.4454   0.748    0.479
#> valuec        0.4167     0.4167   1.000    0.351
#> 
#> Residual standard error: 0.5455 on 7 degrees of freedom
#> Multiple R-squared:  0.1319, Adjusted R-squared:  -0.1161 
#> F-statistic: 0.532 on 2 and 7 DF,  p-value: 0.6094

Broom包可以帮助您处理结果

library(broom)
#> Warning: le package 'broom' a été compilé avec la version R 3.4.4
res <- res %>%
  mutate(tidy_summary = map(regressionresult, broom::tidy))
res         
#> # A tibble: 2 x 4
#>   var_type data              regressionresult tidy_summary        
#>   <chr>    <list>            <list>           <list>              
#> 1 var1     <tibble [10 x 2]> <S3: lm>         <data.frame [3 x 5]>
#> 2 var2     <tibble [10 x 2]> <S3: lm>         <data.frame [3 x 5]>

您可以获得其中一个摘要

res$tidy_summary[[1]]
#>          term  estimate std.error statistic   p.value
#> 1 (Intercept) 0.3333333 0.3149704 1.0583005 0.3250657
#> 2      valueb 0.3333333 0.4454354 0.7483315 0.4786436
#> 3      valuec 0.4166667 0.4166667 1.0000000 0.3506167

或者不需要使用data.frame来处理。

res %>% 
  unnest(tidy_summary)
#> # A tibble: 6 x 6
#>   var_type term        estimate std.error statistic p.value
#>   <chr>    <chr>          <dbl>     <dbl>     <dbl>   <dbl>
#> 1 var1     (Intercept)    0.333     0.315     1.06    0.325
#> 2 var1     valueb         0.333     0.445     0.748   0.479
#> 3 var1     valuec         0.417     0.417     1.000   0.351
#> 4 var2     (Intercept)    0.333     0.315     1.06    0.325
#> 5 var2     valuen         0.417     0.417     1       0.351
#> 6 var2     valueo         0.333     0.445     0.748   0.479

感兴趣的功能是来自[nest] [http://tidyr.tidyverse.org/)的unnesttidyr,可以轻松创建列表列,map来自{{3}允许迭代列表并应用purrr包中的函数(此处为lm)和tidy,该函数提供从模型中整理结果的函数(汇总结果,预测结果,。 ..)

此处未使用,但知道broom包有助于在建模时进行管道处理。