Subdividing a dataset into a list

Date: 2018-06-23 17:03:13

Tags: r label lapply paste anova

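I'm new to R. I've spent hours trying to figure this out and searching Google and SO, but I can't seem to find what I'm looking for. I hope you can help.

I have a dataset that looks like this: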

Site(factor)    Species           Date               Mass       GDD
1               cockerelli      0017-03-14           2.73       252.1
2               doddsii         0017-01-12           3.73       583.4
4               cockerelli      0017-03-14           2.71       385.4
4               doddsii         0018-05-16           2.22       783.2
1               infrequens      0018-05-16           2.89       583.0
etc.

I split() the data frame into a list of data frames, which I then pass to lapply().

splitdata = split(data, paste(data$Species,data$Site))

However, when I use the following code:

grmodel = lapply(splitdata, function(x){
  grmodel = aov(x$Mass~x$GDD)
  print(summary(grmodel))
 })

I get a long run of ANOVA summaries (like the ones below), but I can't tell which species and site each one belongs to.

          Df   Sum Sq   Mean Sq F value Pr(>F)
 x$GDD        1 0.000022 0.0000216   0.044  0.838
 Residuals    9 0.004396 0.0004884               
 1 observation deleted due to missingness
           Df    Sum Sq   Mean Sq F value Pr(>F)
 x$GDD        1 0.0002526 0.0002526    0.65  0.451
 Residuals    6 0.0023319 0.0003887               
 1 observation deleted due to missingness

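Does anyone know how to change the code so that it tells me which species and site each ANOVA table belongs to? I've found some answers involving paste() and other functions, but nothing I've tried has worked.

Thanks very much!

2 Answers:

Answer 0: (score: 0)

As far as I know the names should be visible. I'm not sure what you are seeing, so a reprex might help.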

You can also try broom::tidy for a clearer view:

lapply(split(iris,iris$Species),
       function(x) aov(Petal.Length ~ Petal.Width,x))

# $`setosa`
# Call:
#   aov(formula = Petal.Length ~ Petal.Width, data = x)
# 
# Terms:
#   Petal.Width Residuals
# Sum of Squares    0.1625262 1.3152738
# Deg. of Freedom           1        48
# 
# Residual standard error: 0.1655341
# Estimated effects may be unbalanced
# 
# $versicolor
# Call:
#   aov(formula = Petal.Length ~ Petal.Width, data = x)
# 
# Terms:
#   Petal.Width Residuals
# Sum of Squares     6.695921  4.124079
# Deg. of Freedom           1        48
# 
# Residual standard error: 0.2931183
# Estimated effects may be unbalanced
# 
# $virginica
# Call:
#   aov(formula = Petal.Length ~ Petal.Width, data = x)
# 
# Terms:
#   Petal.Width Residuals
# Sum of Squares     1.548503 13.376297
# Deg. of Freedom           1        48
# 
# Residual standard error: 0.5278947
# Estimated effects may be unbalanced

Using broom::tidy:

library(magrittr)  # provides the %>% pipe used below
lapply(split(iris,iris$Species),
       function(x) aov(Petal.Length ~ Petal.Width,x) %>% broom::tidy())

# $`setosa`
#          term df     sumsq     meansq statistic    p.value
# 1 Petal.Width  1 0.1625262 0.16252620   5.93128 0.01863892
# 2   Residuals 48 1.3152738 0.02740154        NA         NA
# 
# $versicolor
#          term df    sumsq     meansq statistic      p.value
# 1 Petal.Width  1 6.695921 6.69592109  77.93357 1.271916e-11
# 2   Residuals 48 4.124079 0.08591831        NA           NA
# 
# $virginica
#          term df     sumsq    meansq statistic    p.value
# 1 Petal.Width  1  1.548503 1.5485033  5.556707 0.02253577
# 2   Residuals 48 13.376297 0.2786728        NA         NA
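
If one table is easier to scan than a list of printed summaries, the list names can be turned into a grouping column. The following is only a sketch in base R, using the same iris example; the column names `group`, `F_value` and `p_value` are chosen here for illustration and are not part of the answer above.

```r
# Fit one model per species; the list keeps the species names
fits <- lapply(split(iris, iris$Species),
               function(x) aov(Petal.Length ~ Petal.Width, data = x))

# Build one row per group from the first term of each ANOVA table
results <- do.call(rbind, lapply(names(fits), function(nm) {
  tab <- summary(fits[[nm]])[[1]]   # ANOVA table for this group
  data.frame(group   = nm,          # hypothetical column names
             F_value = tab[["F value"]][1],
             p_value = tab[["Pr(>F)"]][1])
}))

results
```

Because each row carries its group name, the result can be sorted or filtered by p-value without losing track of which species it came from.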

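Answer 1: (score: 0)

The names of the result of split are the values of its second argument, coerced to character, and lapply preserves those names. So you don't need to add any names; just look at: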

 names(grmodel)
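
To see the names carried through both steps, here is a minimal sketch on a made-up data frame shaped like the one in the question. The object names `toy` and `fits` and all the values are assumptions for illustration only.

```r
# Toy data standing in for the asker's data frame (values are made up)
toy <- data.frame(
  Site    = factor(c(1, 1, 1, 2, 2, 2)),
  Species = c("cockerelli", "cockerelli", "cockerelli",
              "doddsii", "doddsii", "doddsii"),
  Mass    = c(2.7, 2.9, 3.1, 2.2, 2.5, 2.4),
  GDD     = c(250, 380, 580, 260, 400, 780)
)

# Same splitting key as in the question: "Species Site"
splitdata <- split(toy, paste(toy$Species, toy$Site))

# lapply() carries the names of its input list over to its result
fits <- lapply(splitdata, function(x) aov(Mass ~ GDD, data = x))

names(splitdata)  # "cockerelli 1" "doddsii 2"
names(fits)       # "cockerelli 1" "doddsii 2"
```

The unlabeled output in the question likely comes from calling print() inside the function, where each element is printed without its name; the returned list itself is still named.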

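Perhaps you want something like this for the output:

 for (i in names(grmodel)) {
   cat(i)
   cat(" : : :\n")
   print(grmodel[[i]])
   cat("\n\n")
 }

This simply prints the name of each item in the grmodel list, with some whitespace, ahead of the item itself.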