In R

Time: 2015-11-04 01:21:55

Tags: r dataframe linear-regression

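I have a data frame in R with the following variables: species name, year, and count data for each year. I ran a simple linear regression for each species as follows and collected the resulting coefficients in a data frame: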

 linmodel = data[,
                 list(intercept=coef(lm(x~year))[1], coef=coef(lm(x~year))[2]),
                 by=English_Common_Name]

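This approach only returns the intercept and slope of each regression. Is there a way to also get the p-value and R-squared value and put them in columns of the output data frame?

Here is what a sample of the data looks like: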

     AOU    English_Common_Name year    x
 1  1320    Mallard 1995    444
 2  1320    Mallard 1996    550
 3  1320    Mallard 1997    335
 4  1320    Mallard 1998    351
 5  1320    Mallard 1999    266
 6  1320    Mallard 2000    597
 7  1320    Mallard 2001    620
 8  1320    Mallard 2002    246
 9  1320    Mallard 2003    635
 10 1320    Mallard 2004    301
 11 1320    Mallard 2005    211
 12 1320    Mallard 2006    191
 13 1320    Mallard 2007    223
 14 1320    Mallard 2008    210
 15 1320    Mallard 2009    219
 16 1320    Mallard 2010    166
 17 1320    Mallard 2011    115
 18 1320    Mallard 2012    92
 19 1320    Mallard 2013    47
 20 1320    Mallard 2014    100
 21 1350    Gadwall 1995    37
 22 1350    Gadwall 1996    12
 23 1350    Gadwall 1997    11
 24 1350    Gadwall 1998    11
 25 1350    Gadwall 1999    5
 26 1350    Gadwall 2000    3
 27 1350    Gadwall 2001    4
 28 1350    Gadwall 2002    6
 29 1350    Gadwall 2003    5
 30 1350    Gadwall 2004    9
 31 1350    Gadwall 2005    17
 32 1350    Gadwall 2006    4
 33 1350    Gadwall 2007    15
 34 1350    Gadwall 2008    16
 35 1350    Gadwall 2009    3
 36 1350    Gadwall 2010    23
 37 1350    Gadwall 2011    2

1 Answer:

Answer 0 (score: 2)

This can be done using dplyr's do:
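library(dplyr)
library(magrittr)  # provides use_series() and extract2()

# Note: data_frame() and as_data_frame() are deprecated in newer tibble
# releases in favor of tibble() and as_tibble().
result = 
  data %>%
  group_by(English_Common_Name) %>%
  do({
    # Fit one regression per species; `.` is the current group's rows
    result = lm(x ~ year, .)
    # adj.r.squared is the adjusted R-squared; use r.squared for the plain value
    data_frame(r_squared = 
                 result %>% 
                 summary %>% 
                 use_series(adj.r.squared),
               # p-value of the F-test for year (first row of the ANOVA table)
               p_value = 
                 result %>% 
                 anova %>% 
                 use_series(`Pr(>F)`) %>% 
                 extract2(1) ) %>%
      # Append the intercept and slope as additional columns
      bind_cols(
        result %>%
          coef %>%
          as.list %>%
          as_data_frame)})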
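
For comparison, here is a minimal base-R sketch of the same idea that needs no extra packages. It is not the answerer's method. It assumes the column names from the question (English_Common_Name, year, x) and uses only the first five Mallard and Gadwall rows of the sample as illustration data. The helper name fit_one is made up for this example. It reports the plain (unadjusted) R-squared, and it takes the p-value from the slope's t-test, which in a regression with a single predictor equals the overall F-test p-value.

```r
# Illustrative subset: first five Mallard and Gadwall rows from the sample data.
data <- data.frame(
  English_Common_Name = rep(c("Mallard", "Gadwall"), each = 5),
  year = rep(1995:1999, times = 2),
  x = c(444, 550, 335, 351, 266, 37, 12, 11, 11, 5)
)

# Hypothetical helper: fit one model for a single species and return a
# one-row data frame built from a single summary() call.
fit_one <- function(d) {
  s <- summary(lm(x ~ year, data = d))
  data.frame(
    English_Common_Name = d$English_Common_Name[1],
    intercept = s$coefficients["(Intercept)", "Estimate"],
    slope     = s$coefficients["year", "Estimate"],
    r_squared = s$r.squared,                         # plain, not adjusted
    p_value   = s$coefficients["year", "Pr(>|t|)"]   # slope t-test p-value
  )
}

# Split by species, fit each group, and stack the one-row results.
linmodel <- do.call(rbind, lapply(split(data, data$English_Common_Name), fit_one))
rownames(linmodel) <- NULL
print(linmodel)
```

Fitting the model once per group and reading everything from the summary object also avoids calling lm() twice per species, as the original snippet in the question does.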