Aggregate string columns and count a chosen string

Time: 2012-05-13 13:44:07

Tags: r

I have a data frame in R that looks like this

> head(data)
  var1 var2    var3    var4  level
1  Yes  Yes Unknown      No level1
2   No  Yes      No     Yes level2
3  Yes   No     Yes Unknown level1
4  Yes  Yes Unknown     Yes level2

I would like the number of "Yes" values in each column for every level, like this

        var1 var2 var3 var4
level1     2    1    1    0
level2     1    2    0    2

Any hints? Thanks a lot.

2 answers:

Answer 0: (score: 0)

First I created a fake data frame...

# first a handy helper function :-)
set.seed(123)
f1 <- function(N) { sample(c('Yes','No','Unknown'), N, replace=TRUE) }
# how many rows do I want in my fake data.frame?
N <- 15
data <- data.frame(var1=f1(N),var2=f1(N),var3=f1(N),var4=f1(N),
     level=sample(c('level1','level2'),N, replace=TRUE))

Now I use the table function...

# this gives me how many 'Yes' as the TRUE column for the 'var1' column
table(data$level, data$var1=='Yes')
          FALSE TRUE
  level1     5    2
  level2     7    1

# alternatively I just say for all the levels of var1
table(data$level, data$var1)        
         No Unknown Yes
  level1  3       2   2
  level2  3       4   1

This can then be extended to all columns to get what you are looking for...
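One way to do that extension, as a minimal sketch rather than the only option: apply the same "compare to 'Yes', then count per level" idea to every column with sapply and tapply. The data frame name `example` and the vector `vars` are illustrative names, not from the original post; `example` simply reproduces the four rows shown in the question so the result can be checked against the desired table.

```r
# The four rows from the question, stored under the hypothetical name
# `example` so the random `data` created above is left untouched
example <- data.frame(
  var1  = c("Yes", "No", "Yes", "Yes"),
  var2  = c("Yes", "Yes", "No", "Yes"),
  var3  = c("Unknown", "No", "Yes", "Unknown"),
  var4  = c("No", "Yes", "Unknown", "Yes"),
  level = c("level1", "level2", "level1", "level2"),
  stringsAsFactors = FALSE
)

# Columns to count in (assumed to be every column except `level`)
vars <- c("var1", "var2", "var3", "var4")

# For each column, sum the logical vector (col == "Yes") within each level
counts <- sapply(example[vars],
                 function(col) tapply(col == "Yes", example$level, sum))
counts
#        var1 var2 var3 var4
# level1    2    1    1    0
# level2    1    2    0    2
```

The result is a matrix with one row per level and one column per variable, which matches the layout asked for in the question.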

Or using plyr...

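library(plyr)
# count how many elements of x equal the target string c
f2 <- function(x, c='Yes') { length(which(x==c)) }
# apply f2 to every column except the grouping column 'level', keeping the column names
f3 <- function(ndf) { sapply(ndf[, names(ndf) != 'level', drop=FALSE], f2) }
ddply(data, .(level), f3)
   level var1 var2 var3 var4
1 level1    2    5    2    3
2 level2    1    0    4    4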

Answer 1: (score: -1)

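# TRUE/FALSE flags for "Yes" in var1..var4, plus the grouping column
data.aux <- data.frame(data[,1:4] == "Yes", level = data[, 'level'])
uni <- unique(data[, 'level'])
# for each level, count the TRUE values in each var column
f <- sapply(uni, function(x) colSums(data.aux[data.aux[, 'level']==x, 1:4]))
data.frame(t(f), levels = uni)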
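
The same "flag the matches, then sum per group" idea can probably be written more compactly with base R's rowsum, which sums the rows of a numeric matrix by group. This is a sketch under assumptions, not part of the original answer: the names `q_rows`, `vars`, and `is_yes` are made up for illustration, and `q_rows` just rebuilds the four rows from the question.

```r
# The four rows from the question (hypothetical name `q_rows`)
q_rows <- data.frame(
  var1  = c("Yes", "No", "Yes", "Yes"),
  var2  = c("Yes", "Yes", "No", "Yes"),
  var3  = c("Unknown", "No", "Yes", "Unknown"),
  var4  = c("No", "Yes", "Unknown", "Yes"),
  level = c("level1", "level2", "level1", "level2"),
  stringsAsFactors = FALSE
)

vars <- c("var1", "var2", "var3", "var4")

# Logical matrix marking "Yes", multiplied by 1L to get a 0/1 integer matrix
is_yes <- (q_rows[vars] == "Yes") * 1L

# Sum the 0/1 rows within each level; rows of the result are the sorted levels
res <- rowsum(is_yes, group = q_rows$level)
res
```

Here `res` is a matrix with the levels as row names and var1 to var4 as column names, so no separate `levels` column is needed.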