Question

Here is a reproducible example of the dataset I have:

Health <- data.frame(id=c(1,1,2,3,3),
                     ethnicity = c(2,2,2,1,1),
                     dead=c(0,0,0,0,1), 
                     Asthma = c(1,1,1,0,0),
                     Diabetes = c(1,0,1,0,1),
                     Sex = c("M","F","M","M","M"))

I would like to know how to count the number of times each value occurs in each column, and then export the result as a separate table.

My expected result is as follows:

          0  1  2
Ethnicity    3  2
Asthma    2  3
Dead      4  1
Diabetes  2  3

Answer 1

You need to convert to long format first, then count and spread, i.e.

library(tidyverse)

Health %>% 
 gather(var, val, -c(id, Sex)) %>% 
 group_by(var, val) %>% 
 count() %>% 
 spread(val, n, fill = 0)

which gives,

# A tibble: 4 x 4
# Groups:   var [4]
   var         `0`   `1`   `2`
  <chr>     <int> <int> <int>
1 Asthma        2     3     0
2 dead          4     1     0
3 Diabetes      2     3     0 
4 ethnicity     0     2     3
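
For readers on a recent tidyr, where gather() and spread() are superseded, the same reshape-count-reshape idea can be sketched with pivot_longer() and pivot_wider(). This is only a sketch, assuming tidyr 1.0.0 or later; the names counts, var and val are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
Health <- data.frame(id = c(1, 1, 2, 3, 3),
                     ethnicity = c(2, 2, 2, 1, 1),
                     dead = c(0, 0, 0, 0, 1),
                     Asthma = c(1, 1, 1, 0, 0),
                     Diabetes = c(1, 0, 1, 0, 1),
                     Sex = c("M", "F", "M", "M", "M"))

counts <- Health %>%
  # long format: one row per (column name, value) pair
  pivot_longer(-c(id, Sex), names_to = "var", values_to = "val") %>%
  # number of occurrences of each value within each column
  count(var, val) %>%
  # back to wide: one column per distinct value, 0 where a value never occurs
  pivot_wider(names_from = val, values_from = n, values_fill = list(n = 0L))

print(counts)
```

The row order may depend on the dplyr version and locale, so look rows up by var rather than by position.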

A base R solution using the same idea could be (courtesy of @markus),

t(table(stack(Health[, 2:5])))
#           values
#ind         0 1 2
#  ethnicity 0 2 3
#  dead      4 1 0
#  Asthma    2 3 0
#  Diabetes  2 3 0
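
The question also asks about exporting the result as a separate table. One possible way, sketched here under the assumption that a CSV file is an acceptable export format, is to turn the table into a data frame and write it out. The file name value_counts.csv and the variable names are hypothetical.

```r
# Same example data as in the question
Health <- data.frame(id = c(1, 1, 2, 3, 3),
                     ethnicity = c(2, 2, 2, 1, 1),
                     dead = c(0, 0, 0, 0, 1),
                     Asthma = c(1, 1, 1, 0, 0),
                     Diabetes = c(1, 0, 1, 0, 1),
                     Sex = c("M", "F", "M", "M", "M"))

# Contingency table: one row per column of Health, one column per value
tab <- t(table(stack(Health[, 2:5])))

# Keep the wide layout when converting to a data frame
counts_df <- as.data.frame.matrix(tab)

# Hypothetical output location; replace with the path you need
out_file <- file.path(tempdir(), "value_counts.csv")
write.csv(counts_df, out_file)

# Read it back to check the round trip
back <- read.csv(out_file, row.names = 1, check.names = FALSE)
print(back)
```

as.data.frame.matrix() is used rather than as.data.frame() because the latter would return the table in long format with a Freq column.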

Answer 2

Edit: this is similar to Sotos's answer.

You can use gather to reshape to long format and then call table():

library(dplyr)

Health %>% 
    select(-id, -Sex) %>% 
    tidyr::gather("key", "value") %>% 
    group_by(key) %>% 
    table()

#            value
#key         0 1 2
#  Asthma    2 3 0
#  dead      4 1 0
#  Diabetes  2 3 0
#  ethnicity 0 2 3
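
If reshaping is not wanted at all, a different technique from the ones above is to tabulate each column against a common set of levels. This is a minimal sketch in base R, assuming the columns of interest are columns 2 to 5 as in the example; lv and m are names chosen here for illustration.

```r
# Same example data as in the question
Health <- data.frame(id = c(1, 1, 2, 3, 3),
                     ethnicity = c(2, 2, 2, 1, 1),
                     dead = c(0, 0, 0, 0, 1),
                     Asthma = c(1, 1, 1, 0, 0),
                     Diabetes = c(1, 0, 1, 0, 1),
                     Sex = c("M", "F", "M", "M", "M"))

# All distinct values that occur in the columns of interest
lv <- sort(unique(unlist(Health[, 2:5])))

# Tabulate every column against the same levels so that missing values
# are counted as 0, then transpose to get one row per column of Health
m <- t(sapply(Health[, 2:5], function(x) table(factor(x, levels = lv))))

print(m)
```

Using factor() with shared levels is what keeps the result rectangular; without it, sapply() would return a list because the columns do not all contain the same values.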

How to find the number of instances of values occurring in different columns of an R dataset?
