Splitting a table and running simple regressions iteratively in R

Time: 2014-11-13 19:46:11

Tags: r regression plyr

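I am very new to R. I have a larger table, of which a dummy example is shown below. I would like to split the table by the id variable (a, b, c, d) and run a simple linear regression on each subset: x is my x variable and columns 1:6 are the y variables. I want to regress each of y1-y6 on x, which gives 6 regressions per group and 24 sets of coefficients in total. It would also be great if I could output the p-value of the slope from each model into a new data frame.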

    id x  1  2  3  4  5  6
1   a 74 18 19 NA 23 29  1
2   a 77 16 19 17 22 29  2
3   a 79 16 NA 19 23 29  3
4   a 81 17 20 18 23 29  4
5   b 74 19 20 19 23 28 11
6   b 76 15 19 18 26 28 12
7   b 79 19 21 20 24 28 NA
8   b 81 19 21 20 23 28 14
9   c 68 19 20 20 23 29  8
10  c 70 17 22 22 27 29  9
11  c 73 18 22 21 23 29 10
12  c 75 19 20 19 23 29 11
13  d 65 18 18 19 22 28  5
14  d 68 18 NA 18 20 29  6
15  d 70 18 19 18 23 28  7
16  d 72 19 17 19 22 28  8

I tried using the plyr package, but it did not work:

for ( i in 3:ncol(dumm)){
regression[i] <- dlply(dumm, .(id), function(z) lm(dumm[,i]~dumm$x, z))
}
coefs <- ldply(regression, coef)

Thank you very much!

2 answers:

Answer 0 (score: 0)

Here is an approach using by and lapply:

by(dumm[-1], dumm$id, function(d) {
  lapply(d[-1], function(r) lm(r ~ d$x))
})

You can get the coefficient tables, including the p-values, with the following command:

by(dumm[-1], dumm$id, function(d) {
  lapply(d[-1], function(r) coef(summary(lm(r ~ d$x))))
})

Answer 1 (score: 0)

Try:

> lapply(split(ddf, ddf$id), function(x) lapply(x[3:8], function(y) lm(y~x[[2]])  ))
$a
$a$X1

Call:
lm(formula = y ~ x[[2]])

Coefficients:
(Intercept)       x[[2]]  
    29.1028      -0.1589  


$a$X2

Call:
lm(formula = y ~ x[[2]])

Coefficients:
(Intercept)       x[[2]]  
     7.8378       0.1486  

....

Or, for the full summaries:

> lapply(split(ddf, ddf$id), function(x) lapply(x[3:8], function(y) summary(lm(y~x[[2]]))  ))
$a
$a$X1

Call:
lm(formula = y ~ x[[2]])

Residuals:
      1       2       3       4 
 0.6542 -0.8692 -0.5514  0.7664 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  29.1028    15.3196   1.900    0.198
x[[2]]       -0.1589     0.1969  -0.807    0.504

Residual standard error: 1.019 on 2 degrees of freedom
Multiple R-squared:  0.2455,    Adjusted R-squared:  -0.1317 
F-statistic: 0.6509 on 1 and 2 DF,  p-value: 0.5045


$a$X2

Call:
lm(formula = y ~ x[[2]])

Residuals:
      1       2       4 
 0.1622 -0.2838  0.1216 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  7.83784    5.43394   1.442    0.386
x[[2]]       0.14865    0.07022   2.117    0.281

Residual standard error: 0.3487 on 1 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.8176,    Adjusted R-squared:  0.6351 
F-statistic: 4.481 on 1 and 1 DF,  p-value: 0.2809

...
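
The question also asks for the slope p-values in a new data frame, which the calls above do not produce directly. The following is a minimal sketch of one way to do that in base R, not a definitive solution. It assumes the table is read with read.table (which renames columns 1 to 6 to X1 to X6, as in the output above) and stored as ddf; the helper slope_row and the names y_cols and results are made up for this illustration.

```r
# Rebuild the example table from the question. read.table() turns the
# numeric column names 1..6 into X1..X6 (check.names = TRUE by default).
ddf <- read.table(header = TRUE, text = "
id x  1  2  3  4  5  6
a 74 18 19 NA 23 29  1
a 77 16 19 17 22 29  2
a 79 16 NA 19 23 29  3
a 81 17 20 18 23 29  4
b 74 19 20 19 23 28 11
b 76 15 19 18 26 28 12
b 79 19 21 20 24 28 NA
b 81 19 21 20 23 28 14
c 68 19 20 20 23 29  8
c 70 17 22 22 27 29  9
c 73 18 22 21 23 29 10
c 75 19 20 19 23 29 11
d 65 18 18 19 22 28  5
d 68 18 NA 18 20 29  6
d 70 18 19 18 23 28  7
d 72 19 17 19 22 28  8
")

# Every column except id and x is treated as a response variable.
y_cols <- setdiff(names(ddf), c("id", "x"))

# Hypothetical helper: fit y ~ x on one group's rows and return a one-row
# data frame holding the slope estimate and its p-value.
slope_row <- function(d, y) {
  fit <- lm(reformulate("x", response = y), data = d)
  est <- coef(summary(fit))["x", ]   # the slope row of the coefficient table
  data.frame(id = as.character(d$id[1]),
             y = y,
             slope = est[["Estimate"]],
             p_value = est[["Pr(>|t|)"]],
             stringsAsFactors = FALSE)
}

# Split by id, fit every response within each group, and stack the rows.
results <- do.call(rbind, lapply(split(ddf, ddf$id), function(d) {
  do.call(rbind, lapply(y_cols, slope_row, d = d))
}))
rownames(results) <- NULL
results
```

The result has one row per id and response column (24 rows here); the row for group a and X1 should agree with the slope and p-value in the summary printed above. Note that X5 is constant within groups a, b, and c, so summary() may warn about an essentially perfect fit there, and the p-values for those rows are not meaningful.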