Summarizing data in R by three grouping variables

Time: 2014-07-16 12:52:39

Tags: r crosstab summary

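I would like to summarize data from a transplant experiment to get a) the total number of individuals and b) the totals by sex for each combination of location, substrate, and replicate. I have provided a simplified dataset that contains two records for each location, substrate, and replicate combination. I know how to create a contingency table in R, but not how to create a table (data frame) that summarizes the data across three variables.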

Transplant.Test <- structure(list(Location = c("Kampinge", "Kampinge", "Kampinge", "Kampinge",
                                               "Kampinge", "Kampinge", "Kampinge", "Kampinge",
                                               "Kampinge", "Kampinge", "Kampinge", "Kampinge",
                                               "Kaseberga", "Kaseberga", "Kaseberga", 
                                               "Kaseberga", "Kaseberga", "Kaseberga", 
                                               "Kaseberga", "Kaseberga", "Kaseberga", 
                                               "Kaseberga", "Kaseberga", "Kaseberga"),
                                  Substrate = c("Kampinge", "Kampinge", "Kampinge", "Kampinge",
                                                "Kampinge", "Kampinge", "Kaseberga","Kaseberga",
                                                "Kaseberga", "Kaseberga", "Kaseberga",
                                                "Kaseberga", "Kampinge", "Kampinge",
                                                "Kampinge", "Kampinge", "Kampinge", "Kampinge",
                                                "Kaseberga", "Kaseberga", "Kaseberga",
                                                "Kaseberga", "Kaseberga", "Kaseberga"),
                                 Replicate = c(1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L,
                                               1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L), 
                                 Sex = c("m", "m", "m", "m", "m", "m", "m", "m", "m", "m", "m",
                                         "m", "m", "f", "f", "f", "m", "f", "f", "f", "f", "f",
                                         "m", "m")), 
                                .Names = c("Location", "Substrate", "Replicate", "Sex"), 
                                class = "data.frame", row.names = c(NA, -24L))

The result would be two data tables: Table A would have "Location", "Substrate", "Replicate", and "Total" as columns, and Table B would have "Location", "Substrate", "Replicate", "Male", and "Female" as columns.

Table B would look like this: [image]

Table A, however, would have only a "Total" column instead of "Male" and "Female".

2 answers:

Answer 0 (score: 2)

You might be interested in dcast from the "reshape2" package.

Try the following:

library(reshape2)
dcast(Transplant.Test, Location + Substrate + Replicate ~ "count", 
      value.var="Sex", fun.aggregate=length)
#     Location Substrate Replicate count
# 1   Kampinge  Kampinge         1     2
# 2   Kampinge  Kampinge         2     2
# 3   Kampinge  Kampinge         3     2
# 4   Kampinge Kaseberga         1     2
# 5   Kampinge Kaseberga         2     2
# 6   Kampinge Kaseberga         3     2
# 7  Kaseberga  Kampinge         1     2
# 8  Kaseberga  Kampinge         2     2
# 9  Kaseberga  Kampinge         3     2
# 10 Kaseberga Kaseberga         1     2
# 11 Kaseberga Kaseberga         2     2
# 12 Kaseberga Kaseberga         3     2

dcast(Transplant.Test, Location + Substrate + Replicate ~ Sex, 
      value.var="Sex", fun.aggregate=length)
#     Location Substrate Replicate f m
# 1   Kampinge  Kampinge         1 0 2
# 2   Kampinge  Kampinge         2 0 2
# 3   Kampinge  Kampinge         3 0 2
# 4   Kampinge Kaseberga         1 0 2
# 5   Kampinge Kaseberga         2 0 2
# 6   Kampinge Kaseberga         3 0 2
# 7  Kaseberga  Kampinge         1 1 1
# 8  Kaseberga  Kampinge         2 2 0
# 9  Kaseberga  Kampinge         3 1 1
# 10 Kaseberga Kaseberga         1 2 0
# 11 Kaseberga Kaseberga         2 2 0
# 12 Kaseberga Kaseberga         3 0 2
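
For comparison, here is a minimal sketch of the same two tables in base R, with no extra packages. It is not part of the original answer. It assumes the question's data, rebuilt here in compact form with rep(), and the column names Total, Female, and Male are taken from the question's wording. The indicator-column approach is just one of several ways to do this.

```r
# Compact rebuild of the question's data (same 24 rows, same order).
Transplant.Test <- data.frame(
  Location  = rep(c("Kampinge", "Kaseberga"), each = 12),
  Substrate = rep(rep(c("Kampinge", "Kaseberga"), each = 6), times = 2),
  Replicate = rep(rep(1:3, each = 2), times = 4),
  Sex       = c(rep("m", 12),
                "m", "f", "f", "f", "m", "f", "f", "f", "f", "f", "m", "m")
)

groups <- Transplant.Test[c("Location", "Substrate", "Replicate")]

# Table A: number of records in each Location/Substrate/Replicate combination.
Table_A <- aggregate(list(Total = Transplant.Test$Sex), by = groups, FUN = length)

# Table B: sum logical indicator columns to get one count column per sex.
Table_B <- aggregate(
  data.frame(Female = Transplant.Test$Sex == "f",
             Male   = Transplant.Test$Sex == "m"),
  by = groups, FUN = sum
)

# aggregate() varies the first grouping variable fastest, so reorder the rows
# to match the layout of the dcast output above.
Table_A <- Table_A[order(Table_A$Location, Table_A$Substrate, Table_A$Replicate), ]
Table_B <- Table_B[order(Table_B$Location, Table_B$Substrate, Table_B$Replicate), ]
```

Because the zero counts come from summing FALSE values, combinations with no females (or no males) show up as 0 rather than being dropped.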

Answer 1 (score: 0)

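library(plyr)

# count() puts the tally for each combination in a "freq" column; rename it to "Total".
Table_A <- count(Transplant.Test, c('Location', 'Substrate', 'Replicate'))
names(Table_A) <- c("Location", "Substrate", "Replicate", "Total")

# Adding 'Sex' to the grouping variables gives the per-sex totals in long format.
Table_B <- count(Transplant.Test, c('Location', 'Substrate', 'Replicate', 'Sex'))
names(Table_B) <- c("Location", "Substrate", "Replicate", "Sex", "Total")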
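
Note that counting by Sex as a fourth grouping variable gives a long table (one row per sex), while the question asks for one column per sex. The following is a rough sketch of one way to widen such a table with base R's reshape(). It is not from the original answer. To stay free of package dependencies, the long table is built with aggregate() as a stand-in for the count() output, and the names long_counts and wide_counts are made up for illustration.

```r
# Compact rebuild of the question's data (same 24 rows, same order).
Transplant.Test <- data.frame(
  Location  = rep(c("Kampinge", "Kaseberga"), each = 12),
  Substrate = rep(rep(c("Kampinge", "Kaseberga"), each = 6), times = 2),
  Replicate = rep(rep(1:3, each = 2), times = 4),
  Sex       = c(rep("m", 12),
                "m", "f", "f", "f", "m", "f", "f", "f", "f", "f", "m", "m")
)

# Long-format counts: one row per Location/Substrate/Replicate/Sex combination
# that actually occurs (a stand-in for the shape of the count() result).
long_counts <- aggregate(
  list(Total = rep(1L, nrow(Transplant.Test))),
  by = Transplant.Test[c("Location", "Substrate", "Replicate", "Sex")],
  FUN = sum
)

# Spread Sex into columns; reshape() names them Total.f and Total.m.
wide_counts <- reshape(long_counts,
                       idvar = c("Location", "Substrate", "Replicate"),
                       timevar = "Sex", v.names = "Total",
                       direction = "wide")

# A combination with no records for one sex comes out as NA; treat it as 0.
for (col in c("Total.f", "Total.m")) {
  wide_counts[[col]][is.na(wide_counts[[col]])] <- 0L
}
names(wide_counts)[names(wide_counts) == "Total.f"] <- "Female"
names(wide_counts)[names(wide_counts) == "Total.m"] <- "Male"

# Sort rows by the grouping variables for readability.
wide_counts <- wide_counts[order(wide_counts$Location,
                                 wide_counts$Substrate,
                                 wide_counts$Replicate), ]
```

The NA-to-zero step matters here because a long table only has rows for combinations that occur, so a replicate with no females has no "f" row to spread.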