Creating better frequency tables in R

Time: 2014-03-13 17:03:13

Tags: r plyr frequency-distribution

Here is some data:

dta <- data.frame(
  id = 1:10, 
  code1 = as.factor(sample(c("male", "female"), 10, replace = TRUE)),
  code2 = as.factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE)),
  code3 = as.factor(sample(c("yes", "no"), 10, replace = TRUE))
)

I would like a nicely formatted frequency table for each of the code variables.

codes <- c("code1", "code2", "code3")

For example, we can run the built-in table command:

> sapply(dta[, codes], table)
$code1

female   male 
     4      6 

$code2

maybe    no   yes 
    5     2     3 

$code3

 no yes 
  4   6 
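
A side note on the call above: sapply returns a list here only because the three tables have different lengths. The following is a minimal sketch of that behaviour, assuming a fixed seed and explicit factor levels (both are added for reproducibility and are not part of the original data):

```r
# Assumption: set.seed and explicit levels are added so the table lengths are fixed.
set.seed(1)
dta <- data.frame(
  id = 1:10,
  code1 = factor(sample(c("male", "female"), 10, replace = TRUE),
                 levels = c("female", "male")),
  code2 = factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE),
                 levels = c("maybe", "no", "yes")),
  code3 = factor(sample(c("yes", "no"), 10, replace = TRUE),
                 levels = c("no", "yes"))
)

# Tables of length 2, 3 and 2 cannot be simplified, so sapply returns a list.
mixed <- sapply(dta[, c("code1", "code2", "code3")], table)
is.list(mixed)

# Two tables of equal length are simplified to a matrix instead.
same <- sapply(dta[, c("code1", "code3")], table)
is.matrix(same)

# lapply returns a list regardless of the lengths of the results.
always <- lapply(dta[, c("code1", "code3")], table)
is.list(always)
```

If the return type matters for later steps, lapply is the more predictable choice, since it never tries to simplify.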

All the information is there, but it would be nicer to have it as a table (a data frame):

library(plyr)
ddply(dta, .(code1), summarize, n1 = length(code1))
   code1 n1
1 female  4
2   male  6

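That is what I want, three times over (once per variable). The result can be separate data frames or a single one.

How can we loop over the variables? Any other approach is welcome too.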

2 answers:

Answer 0: (score: 1)

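You can use lapply to apply table to each column and wrap each result in a data frame. Note that as.data.frame(table) cannot be passed to lapply directly, because it would try to convert the function table itself; an anonymous function is needed:

codes <- c("code1", "code2", "code3")
# dnn = NULL drops the dimension name, so the columns come out as value.Var1 and value.Freq
tbl <- lapply(dta[, codes], function(x) data.frame(value = table(x, dnn = NULL)))

which will give you: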

tbl
$code1
  value.Var1 value.Freq
1     female          6
2       male          4

$code2
  value.Var1 value.Freq
1      maybe          4
2         no          5
3        yes          1

$code3
  value.Var1 value.Freq
1         no          4
2        yes          6

So you can access each data frame with tbl$code1, tbl$code2, and so on. For example:

tbl$code1
  value.Var1 value.Freq
1     female          6
2       male          4
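
The question also allows for a single data frame. One possible way to get there in base R is sketched below, assuming a fixed seed and the column names variable, value and n, none of which come from the original answer:

```r
# Assumption: set.seed is added only to make the sketch reproducible.
set.seed(1)
dta <- data.frame(
  id = 1:10,
  code1 = as.factor(sample(c("male", "female"), 10, replace = TRUE)),
  code2 = as.factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE)),
  code3 = as.factor(sample(c("yes", "no"), 10, replace = TRUE))
)
codes <- c("code1", "code2", "code3")

# Build one frequency table per variable and tag it with the variable name.
parts <- lapply(codes, function(v) {
  out <- as.data.frame(table(dta[[v]]), stringsAsFactors = FALSE)
  names(out) <- c("value", "n")   # hypothetical column names
  cbind(variable = v, out)
})

# Stack the per-variable tables into a single long data frame.
freq <- do.call(rbind, parts)
freq
```

Each variable contributes one row per observed level, so the counts within a variable add up to the number of rows in dta.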

Answer 1: (score: 1)

Does this work? It is generic, in that it does not need to know the "codes" in advance.

library(plyr)
library(reshape)

dta <- data.frame(
  id = 1:10, 
  code1 = as.factor(sample(c("male", "female"), 10, replace = TRUE)),
  code2 = as.factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE)),
  code3 = as.factor(sample(c("yes", "no"), 10, replace = TRUE))
)

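# Reshape to long format: one row per (id, variable, value)
d1 <- melt(dta, "id")

# Count each (variable, value) pair; plyr::count adds a "freq" column
d2 <- count(d1, .(variable, value))

# Split by variable and return one small frequency table per variable
d3 <- by(d2, d2$variable, function(x) {
  v <- as.character(x[1,]$variable)  # name of the current variable
  y <- x[,2:3]                       # keep the value and freq columns
  colnames(y) <- c(v, "n1")
  return(y)
})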

d3 

## d2$variable: code1
##    code1 n1
## 1 female  6
## 2   male  4
## ----------------------------------------------------------------------- 
## d2$variable: code2
##   code2 n1
## 3 maybe  2
## 4    no  5
## 5   yes  3
## ----------------------------------------------------------------------- 
## d2$variable: code3
##   code3 n1
## 6    no  4
## 7   yes  6
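
For readers who would rather avoid the plyr and reshape dependencies, the same long-format idea can be sketched in base R with stack and table. This is an alternative under stated assumptions (a fixed seed, and the names long, freq and by_var are made up for illustration), not the answer's own method:

```r
# Assumption: set.seed is added only to make the sketch reproducible.
set.seed(1)
dta <- data.frame(
  id = 1:10,
  code1 = as.factor(sample(c("male", "female"), 10, replace = TRUE)),
  code2 = as.factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE)),
  code3 = as.factor(sample(c("yes", "no"), 10, replace = TRUE))
)

# Every column except id is treated as a code variable.
vars <- setdiff(names(dta), "id")

# stack() needs plain vectors, so convert the factors to character first.
long <- stack(lapply(dta[, vars], as.character))

# Cross-tabulate variable by value, then drop combinations that never occur.
freq <- as.data.frame(table(variable = long$ind, value = long$values),
                      responseName = "n1")
freq <- freq[freq$n1 > 0, ]

# One small frequency table per variable.
by_var <- split(freq[, c("value", "n1")], freq$variable)
by_var
```

The cross-tabulation contains every variable and value combination, which is why the zero-count rows are filtered out before splitting.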