I have the following minimal example in R:
testing = data.frame(c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"), c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely"))
colnames(testing) = c("one", "two")
testing
           one          two
1  Once a week Once a month
2  Once a week Once a month
3       Rarely  Once a week
4 Once a month       Rarely
5 Once a month       Rarely
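I would like the final result to be a data frame in which one column holds all possible categories and the remaining columns hold the counts for each column/variable, like this:
categories    one two
Rarely          1   2
Once a month    2   2
Once a week     2   1
I have no restrictions on R libraries, so whatever is simplest here (perhaps plyr / dplyr?).
Thanks.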
Answer 0 (score: 7)
table() works without any external packages:
sapply(testing, table)
#             one two
#Once a month   2   2
#Once a week    2   1
#Rarely         1   2
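Note that sapply(testing, table) only lines up cleanly when every column contains the same set of values, and it returns a matrix rather than the data frame the question asks for. A minimal sketch of one way to handle both points, assuming a shared level vector (the names lvls, counts and result are made up for this illustration):

```r
# Rebuild the example data from the question
testing <- data.frame(
  one = c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"),
  two = c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely")
)

# Assumption: the full set of categories, in the desired row order
lvls <- c("Rarely", "Once a month", "Once a week")

# Tabulate each column against the same levels, so absent values count as 0
counts <- sapply(testing, function(x) table(factor(x, levels = lvls)))

# Move the row names into a regular "categories" column
result <- data.frame(categories = rownames(counts), counts, row.names = NULL)
result
```

Fixing the levels up front also controls the row order, which otherwise defaults to alphabetical.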
Answer 1 (score: 2)
You can use the tidyr and dplyr packages to tidy your table, and the base table function to count the categories:
testing = data.frame(c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"), c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely"))
colnames(testing) = c("one", "two")
testing
#>            one          two
#> 1  Once a week Once a month
#> 2  Once a week Once a month
#> 3       Rarely  Once a week
#> 4 Once a month       Rarely
#> 5 Once a month       Rarely
library(tidyr)
library(dplyr)
# reshape to long form (one row per original cell), then cross-tabulate
testing %>%
  gather("type", "categories") %>%
  table()
#>      categories
#> type  Once a month Once a week Rarely
#>   one            2           2      1
#>   two            2           1      2
# or reorder the columns before calling table()
testing %>%
  gather("type", "categories") %>%
  select(categories, type) %>%
  table()
#>               type
#> categories     one two
#>   Once a month   2   2
#>   Once a week    2   1
#>   Rarely         1   2
Answer 2 (score: 2)
Here is another way that makes use of tidyr::gather, tidyr::spread, and dplyr::count:
library(dplyr)
library(tidyr)
testing %>%
  gather(measure, value) %>%
  count(measure, value) %>%
  spread(measure, n)
# Source: local data frame [3 x 3]
#
#          value   one   two
#          (chr) (int) (int)
# 1 Once a month     2     2
# 2  Once a week     2     1
# 3       Rarely     1     2
Also, see this fantastic gist on this topic.
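In current tidyr releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A rough equivalent of the pipeline above might look like the following sketch, which assumes tidyr 1.1.0 or later; the column names measure and categories and the variable result_wide are just illustrative choices:

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
testing <- data.frame(
  one = c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"),
  two = c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely")
)

result_wide <- testing %>%
  # long form: one row per original cell, tagged with its source column
  pivot_longer(everything(), names_to = "measure", values_to = "categories") %>%
  # count each (source column, category) pair
  count(measure, categories) %>%
  # back to wide form: one count column per source column, 0 where a pair is absent
  pivot_wider(names_from = measure, values_from = n, values_fill = 0L)

result_wide
```

The values_fill argument only matters when some category is missing from one of the columns; with the example data every category occurs in both.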