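I'm trying to figure out how to summarize my output. I've created some dummy data that approximates my actual data: hundreds of group1 values, 3 levels of group2, and dozens of validation logicals. Sorry if this seems simple. I've hunted and pecked a lot, and I have to say that, as a newcomer to R, the variety of tools out there (the apply family, ddply, aggregate, table, reshape, etc.) is both wonderful and a bit scary :)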
#create data
group1 <- paste("Group", LETTERS[1:7])
group2 <- c("UNC", "UNC", "SS", "LS", "LS", "SS", "UNC")
valid1 <- c("Y", "N", NA, "N", "Y", "Y", "N")
valid2 <- c("N", "N", "Y", "N", "N", "Y", "N")
valid3 <- c(1.4, 1.2, NA, 0.7, 0.3, NA, 1.7)
valid4 <- c(0.4, 0.3, 0.53, 0.66, 0.3, 0.3, 0.71)
valid5 <- c(8.5, 11.2,NA, NA, 8.3, NA, 11.7)
testdata <- data.frame(cbind(group1, group2, valid1, valid2, valid3, valid4, valid5))
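#recode each validation as 1 (pass) or 0 (fail); ifelse() is vectorized, so no loop is needed
#note: the function reads the global vectors above, not the columns of its argument
valid <- function(testdata){
val1 <- ifelse(valid1=="Y", 1,0)
val2 <- ifelse(valid2=="Y", 1,0)
val3 <- ifelse(valid3>=1.0, 1,0)
val4 <- ifelse(valid4<=0.5, 1,0)
val5 <- ifelse(valid5>=10.0, 1,0)
test.out <- data.frame(cbind(group1,group2, val1, val2, val3, val4, val5))
test.out
}
validtry <- valid(testdata)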
Then I need to convert these logicals to numeric so that I can sum them:
#make validations numeric
# why doesn't this work:
# validtry[,3:7] <- as.numeric(validtry[,3:7])
#but these do
validtry[,3] <- as.numeric(validtry[,3])
validtry[,4] <- as.numeric(validtry[,4])
validtry[,5] <- as.numeric(validtry[,5])
validtry[,6] <- as.numeric(validtry[,6])
validtry[,7] <- as.numeric(validtry[,7])
######
#summarize validtry
#sum on both groups
aggregate(validtry[,3:7], by=list(validtry$group1, validtry$group2), sum, na.rm=T)
#sum on one group
aggregate(validtry[,3:7], by=list(validtry$group2), sum, na.rm=T)
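So, the last two get me close, but I think I need something different? I'm trying to sum across both the rows and the columns for the two groups. I'm familiar with tapply, but I don't seem to be getting it.
Thanks in advance!!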
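On the commented-out line above: `validtry[,3:7]` is a data frame, that is, a list of columns, and `as.numeric()` cannot coerce a list of multi-element columns into one numeric vector, so the call stops with an error instead of converting each column. A minimal sketch of a per-column conversion with `lapply()` follows. The small `demo` data frame and the `cols` vector are hypothetical stand-ins for `validtry` and its flag columns, stored as text the way `data.frame(cbind(...))` produces them when text and numbers are mixed.

```r
# Hypothetical stand-in for validtry: the 0/1 flags are stored as text
demo <- data.frame(
  group2 = c("UNC", "SS", "LS"),
  val1   = c("1", NA, "0"),
  val2   = c("0", "1", "0"),
  stringsAsFactors = FALSE
)

# as.numeric(demo[, 2:3]) would stop with an error at this point

# Convert each column separately and assign the resulting list back.
# Going through as.character() also covers factor columns, where a bare
# as.numeric() would return the internal level codes instead of the values.
cols <- c("val1", "val2")
demo[cols] <- lapply(demo[cols], function(x) as.numeric(as.character(x)))

str(demo)
```

The factor case matters on R versions before 4.0, where text columns in a data frame become factors by default.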
Answer 0 (score: 0)
The expected output isn't clear. My guess is:
testdata <- data.frame(group1, group2, valid1, valid2, valid3, valid4, valid5)
str1 <- c("valid1=='Y'", "valid2=='Y'", "valid3>=1.0", "valid4 <=0.5", "valid5>=10.0")
validtry <- testdata
#eval(parse(...)) is used here for brevity, although it is generally not recommended
validtry[,-(1:2)] <- lapply(str1, function(x) 1*with(testdata, eval(parse(text=x))))
library(reshape2)
lst <- lapply(validtry[3:7], function(x)
  dcast(data.frame(validtry[1:2], x), group1~group2, value.var="x", sum, na.rm=TRUE))
lst[[1]]
#   group1 LS SS UNC
#1 Group A  0  0   1
#2 Group B  0  0   0
#3 Group C  0  0   0
#4 Group D  0  0   0
#5 Group E  1  0   0
#6 Group F  0  1   0
#7 Group G  0  0   0
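If the goal is instead a count of passed checks summed across both rows and columns, one possible variation writes the comparisons directly (so no `eval(parse())` is needed), adds a per-row total with `rowSums()`, and then sums every column within each level of `group2` with `aggregate()`. This is only a sketch under the same guess about the expected output; the names `flags`, `checks`, `n_passed`, and `by_group2` are made up for illustration.

```r
# Same dummy data as in the question
group1 <- paste("Group", LETTERS[1:7])
group2 <- c("UNC", "UNC", "SS", "LS", "LS", "SS", "UNC")
valid1 <- c("Y", "N", NA, "N", "Y", "Y", "N")
valid2 <- c("N", "N", "Y", "N", "N", "Y", "N")
valid3 <- c(1.4, 1.2, NA, 0.7, 0.3, NA, 1.7)
valid4 <- c(0.4, 0.3, 0.53, 0.66, 0.3, 0.3, 0.71)
valid5 <- c(8.5, 11.2, NA, NA, 8.3, NA, 11.7)

# Each comparison is vectorized; as.integer() maps TRUE/FALSE to 1/0
# and keeps NA wherever the input was missing
flags <- data.frame(
  group1, group2,
  val1 = as.integer(valid1 == "Y"),
  val2 = as.integer(valid2 == "Y"),
  val3 = as.integer(valid3 >= 1.0),
  val4 = as.integer(valid4 <= 0.5),
  val5 = as.integer(valid5 >= 10.0)
)

checks <- paste0("val", 1:5)

# Row totals: number of checks passed per row, counting NA as not passed
flags$n_passed <- rowSums(flags[checks], na.rm = TRUE)

# Column totals within each level of group2
by_group2 <- aggregate(flags[c(checks, "n_passed")],
                       by = list(group2 = flags$group2),
                       FUN = sum, na.rm = TRUE)
by_group2
```

Because the flags are created as numbers in a plain `data.frame()` call rather than through `cbind()`, no later conversion step is needed. Whether a missing value should count as a failed check is a judgment call. Dropping `na.rm = TRUE` would propagate `NA` into the totals instead.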