Converting to a frequency table in R

Time: 2018-01-05 07:05:31

Tags: r dplyr reshape2

My table looks like this:

> dt <- data.frame(C1 = c("one", "two", "one"), C2 = c("one", "two", "two"))
> dt
   C1  C2
1 one one
2 two two
3 one two

Now I need to turn the table above into this:

> dt <- data.frame(var = c("one", "two"), C1 = c(2, 1), C2 = c(1, 2))
> dt
  var C1 C2
1 one  2  1
2 two  1  2

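I have tried various functions but could not get this result.

3 answers:

Answer 0 (score: 4)

One option with the tidyverse is to gather the data into 'long' format, count, and then spread it back to 'wide' format: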

library(dplyr)
library(tidyr)
gather(dt, key, val) %>%
      count(key, val) %>%
      spread(key, n)
# A tibble: 2 x 3
#  val      C1    C2
#* <chr> <int> <int>
#1 one       2     1
#2 two       1     2
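In newer versions of tidyr, gather and spread are superseded by pivot_longer and pivot_wider. A minimal sketch of the same long/count/wide approach follows, assuming a recent tidyr (1.1 or later) and dplyr; the name res and the values_fill = 0 argument are additions for illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

dt <- data.frame(C1 = c("one", "two", "one"), C2 = c("one", "two", "two"))

res <- dt %>%
  # long format: one row per (column name, value) pair
  pivot_longer(everything(), names_to = "key", values_to = "val") %>%
  # count how often each value occurs in each original column
  count(key, val) %>%
  # wide format again: one column of counts per original column;
  # values_fill = 0 (an assumption) puts 0 where a value never occurs
  pivot_wider(names_from = key, values_from = n, values_fill = 0)

res
```

With values_fill = 0, a value that is missing from one column shows up as 0 instead of NA.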

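If we are only interested in the frequencies, use summarise_all together with tabulate:

# factor() lets tabulate() handle character columns as well as factors;
# counts follow the sorted levels of each column, without labels
dt %>%
    summarise_all(~ list(tabulate(factor(.)))) %>%
    unnest(cols = everything())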

Or use base R:

sapply(dt, table)
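sapply(dt, table) returns a matrix with the values as row names, provided every column contains the same set of values. As a small sketch, the matrix can be turned into the data frame layout asked for in the question; the names counts and res are illustrative only.

```r
dt <- data.frame(C1 = c("one", "two", "one"), C2 = c("one", "two", "two"))

# One table per column, simplified to a matrix (rows: values, columns: C1, C2)
counts <- sapply(dt, table)

# Move the row names into a 'var' column to match the requested layout
res <- data.frame(var = rownames(counts), counts, row.names = NULL)
res
```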

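Answer 1 (score: 3)

Another solution, below:
    1. Use the reshape library to melt the data.
    2. Create the table and transpose it (because melt puts the variable column first).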

> dt <- data.frame(C1 = c("one", "two", "one"), C2 = c("one", "two", "two"))
> dt

   C1  C2
1 one one
2 two two
3 one two

> library(reshape)
> t(table(melt(dt, measure.vars = c("C2", "C1"))))

     variable
value C2 C1
  one  1  2
  two  2  1
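If the reshape package is not available, the same melt-then-tabulate idea can be sketched in base R. This is an illustrative alternative rather than the answer's own method; the names long and tab are hypothetical. Putting value first means no transpose is needed.

```r
dt <- data.frame(C1 = c("one", "two", "one"), C2 = c("one", "two", "two"))

# Hand-rolled "melt": stack all values and record which column each came from
long <- data.frame(
  value    = unlist(lapply(dt, as.character), use.names = FALSE),
  variable = rep(names(dt), each = nrow(dt))
)

# Cross-tabulate value (rows) against variable (columns)
tab <- table(long)
tab
```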

Answer 2 (score: 2)

Here is a base R solution that still works when some of the factor levels do not appear in every column.

> dt <- data.frame(C1 = c("one", "two", "one", "one"), C2 = c("one", "two", "two", "three"))
> dt
   C1    C2
1 one   one
2 two   two
3 one   two
4 one three
> globalLevels <- as.character(unique(unlist(dt)))
> as.data.frame(lapply(dt, function(x) summary(factor(x, globalLevels))))
      C1 C2
one    3  1
two    1  2
three  0  1
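The shared set of levels is what makes this work: when the columns contain different values, sapply(dt, table) cannot simplify its result to a matrix and returns a list instead. A minimal sketch of the same idea using table in place of summary follows; the names lv, counts and res are illustrative.

```r
dt <- data.frame(C1 = c("one", "two", "one", "one"),
                 C2 = c("one", "two", "two", "three"))

# Without common levels the per-column tables differ in length,
# so sapply() falls back to returning a list
raw <- sapply(dt, table)

# Common levels across all columns, in order of first appearance
lv <- unique(unlist(lapply(dt, as.character)))

# Tabulate every column against the same levels; absent values count as 0
counts <- sapply(dt, function(x) table(factor(x, levels = lv)))

res <- data.frame(var = rownames(counts), counts, row.names = NULL)
res
```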