Calculating totals per unique value with the aggregate function

Time: 2016-12-12 03:07:26

Tags: r aggregate

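I'm having trouble calculating the sums of the planted and death counts for each unique breeding code. I tried to do this with the aggregate function.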

Goal:

##      rowBrCrCode rowPlanted rowPsaDeath rowSurvival
##   1:      GL_287         63          24           0
##   2:      GL_287         13           7           0
##   3:      GL_287         67          26           0
##   4:     aCK_227         17           5           0
##   5:     aCK_406         20           1           0

into

##      rowBrCrCode rowPlanted rowPsaDeath rowSurvival
##   1:      GL_287        143          57           0
##   2:     aCK_227         17           5           0
##   3:     aCK_406         20           1           0

rowSurvival will be calculated after this step.

Current code, with the problematic line commented out (I'm writing in R):

library(stringr)
library(data.table)
library(plyr)
test <- fread(file.choose(), header = TRUE, data.table = TRUE)
testRowLength <- nrow(test)
rowBrCrCode <- ""
rowPlanted <- 0
rowPsaDeath <- 0
rowSurvival <- 0
for(i in 1:testRowLength){
  if(test$BrCrCode[i] == ""){
    test$BrCrCode[i] <- paste("PH_", i, sep = "")
    print(paste("hit found, turned nothing into ", test$BrCrCode[i], sep = ""))
  }
  slashCount <- str_count(test$BrCrCode[i], '/')
  if(slashCount == 1){
    print(paste("hit found, turn ", test$BrCrCode[i], " into ", unlist(strsplit(test$BrCrCode[i], split = '/', fixed=TRUE))[1], sep = ""))
    test$BrCrCode[i] <- unlist(strsplit(test$BrCrCode[i], split = '/', fixed=TRUE))[1]
  }
  else if(slashCount > 1){
    print(paste("control found, value ", test$BrCrCode[i], " into ", paste("control_", unlist(strsplit(test$BrCrCode[i], split = '/', fixed=TRUE))[1], sep = ""), sep = ""))
    test$BrCrCode[i] <- paste("control_", unlist(strsplit(test$BrCrCode[i], split = '/', fixed=TRUE))[1], sep = "")
  }
  rowBrCrCode[i] <- test$BrCrCode[i]
  rowPlanted[i] <- test$Planted[i]
  rowPsaDeath[i] <- test$PsaDeath[i]
}
firstDT <- data.table(rowBrCrCode, rowPlanted, rowPsaDeath, rowSurvival)
print(firstDT)
#firstDT_agg <- aggregate(x = firstDT, by = list(rowPlanted, rowPsaDeath), FUN = "sum")

2 answers:

Answer 0: (score: 1)

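The dataset appears to be a data.table, so we can use data.table methods. Grouping by 'rowBrCrCode', we loop over the columns of the Subset of Data.table (.SD) and take the sum of each one. Here `dt` stands for the table built in the question (`firstDT`).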

library(data.table)
dt[, lapply(.SD, sum), by = rowBrCrCode]
#       rowBrCrCode rowPlanted rowPsaDeath rowSurvival
#1:      GL_287        143          57           0
#2:     aCK_227         17           5           0
#3:     aCK_406         20           1           0
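
To try the grouped sum without the original input file, here is a minimal sketch that rebuilds the five sample rows from the question by hand. The names `countCols` and `totals` are introduced only for this illustration, and `.SDcols` is added as an assumption about which columns should be summed, so that any extra non-numeric column in a real table would not reach `sum`.

```r
library(data.table)

# Sample rows copied from the question; firstDT mirrors the question's table.
firstDT <- data.table(
  rowBrCrCode = c("GL_287", "GL_287", "GL_287", "aCK_227", "aCK_406"),
  rowPlanted  = c(63, 13, 67, 17, 20),
  rowPsaDeath = c(24, 7, 26, 5, 1),
  rowSurvival = 0
)

# Hypothetical helper name: the count columns to be summed per breeding code.
countCols <- c("rowPlanted", "rowPsaDeath", "rowSurvival")

# Group by the code and sum only the columns listed in .SDcols.
totals <- firstDT[, lapply(.SD, sum), by = rowBrCrCode, .SDcols = countCols]
print(totals)
```

Without `.SDcols`, `.SD` holds every column except the grouping column, which is enough for the table shown in the question.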

Answer 1: (score: 0)

If all you want to do is aggregate, this should do it:

# Use the columns of firstDT from the question; the grouping column there is rowBrCrCode
test2 <- aggregate(cbind(rowPlanted, rowPsaDeath, rowSurvival) ~ rowBrCrCode,
                   data = firstDT, FUN = sum)
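
The commented-out call in the question most likely fails for two reasons: `by = list(rowPlanted, rowPsaDeath)` groups by the count columns instead of the breeding code, and passing the whole table as `x` asks `sum` to add up the character column as well. A minimal sketch of the non-formula form of `aggregate` with those two points changed follows. It uses the sample rows from the question in a plain data.frame, and the names `firstDF` and `totals` are made up for this illustration.

```r
# Sample rows copied from the question, held in a plain data.frame.
firstDF <- data.frame(
  rowBrCrCode = c("GL_287", "GL_287", "GL_287", "aCK_227", "aCK_406"),
  rowPlanted  = c(63, 13, 67, 17, 20),
  rowPsaDeath = c(24, 7, 26, 5, 1),
  rowSurvival = 0
)

# x holds only the numeric columns to sum; by holds the grouping column.
totals <- aggregate(
  x   = firstDF[, c("rowPlanted", "rowPsaDeath", "rowSurvival")],
  by  = list(rowBrCrCode = firstDF$rowBrCrCode),
  FUN = sum
)
print(totals)
```

The result has one row per breeding code. `aggregate` sorts the groups, so the row order may differ from the order in which the codes first appear in the input.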