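I'm having trouble computing the sum of the planted and death counts for each unique breeding code. I tried to do this with the aggregate function.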
Goal:
## rowBrCrCode rowPlanted rowPsaDeath rowSurvival
## 1: GL_287 63 24 0
## 2: GL_287 13 7 0
## 3: GL_287 67 26 0
## 4: aCK_227 17 5 0
## 5: aCK_406 20 1 0
into
## rowBrCrCode rowPlanted rowPsaDeath rowSurvival
## 1: GL_287 143 57 0
## 2: aCK_227 17 5 0
## 3: aCK_406 20 1 0
rowSurvival will be calculated after this step.
Current code, with the problematic line commented out (written in R):
library(stringr)
library(data.table)
library(plyr)
test <- fread(file.choose(), header = TRUE, data.table = TRUE)
testRowLength <- nrow(test)
rowBrCrCode <- ""
rowPlanted <- 0
rowPsaDeath <- 0
rowSurvival <- 0
for(i in 1:testRowLength){
  # Empty codes get a placeholder name based on the row index
  if(test$BrCrCode[i] == ""){
    test$BrCrCode[i] <- paste("PH_", i, sep = "")
    print(paste("hit found, turned nothing into ", test$BrCrCode[i], sep = ""))
  }
  slashCount <- str_count(test$BrCrCode[i], '/')
  if(slashCount == 1){
    # One slash: keep only the part before the slash
    print(paste("hit found, turn ", test$BrCrCode[i], " into ", unlist(strsplit(test$BrCrCode[i], split = '/', fixed=TRUE))[1], sep = ""))
    test$BrCrCode[i] <- unlist(strsplit(test$BrCrCode[i], split = '/', fixed=TRUE))[1]
  } else if(slashCount > 1){
    # More than one slash: treat as a control and prefix the first part
    print(paste("control found, value ", test$BrCrCode[i], " into ", paste("control_", unlist(strsplit(test$BrCrCode[i], split = '/', fixed=TRUE))[1], sep = ""), sep = ""))
    test$BrCrCode[i] <- paste("control_", unlist(strsplit(test$BrCrCode[i], split = '/', fixed=TRUE))[1], sep = "")
  }
  rowBrCrCode[i] <- test$BrCrCode[i]
  rowPlanted[i] <- test$Planted[i]
  rowPsaDeath[i] <- test$PsaDeath[i]
}
firstDT <- data.table(rowBrCrCode, rowPlanted, rowPsaDeath, rowSurvival)
print(firstDT)
#firstDT_agg <- aggregate(x = firstDT, by = list(rowPlanted, rowPsaDeath), FUN = "sum")
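Answer 0 (score: 1)
The dataset appears to be a data.table, so we can use data.table methods. Grouping by 'rowBrCrCode', we loop over the columns of the Subset of Data.table (.SD) and take the sum of each. Here dt stands for the table to aggregate (firstDT in the question).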
library(data.table)
dt[, lapply(.SD, sum), by = rowBrCrCode]
# rowBrCrCode rowPlanted rowPsaDeath rowSurvival
#1: GL_287 143 57 0
#2: aCK_227 17 5 0
#3: aCK_406 20 1 0
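To try the grouping without the original input file, here is a minimal self-contained sketch. It rebuilds the five sample rows from the question's "Goal" table by hand and reuses the question's name firstDT. The explicit .SDcols argument is an addition for illustration, not part of the answer above; it only makes visible which columns are summed.

```r
library(data.table)

# Sample rows typed in from the question's "Goal" table
firstDT <- data.table(
  rowBrCrCode = c("GL_287", "GL_287", "GL_287", "aCK_227", "aCK_406"),
  rowPlanted  = c(63, 13, 67, 17, 20),
  rowPsaDeath = c(24, 7, 26, 5, 1),
  rowSurvival = 0  # placeholder, recycled to every row
)

# Sum the numeric columns within each breeding code.
# .SDcols (added here for illustration) names the columns that .SD contains.
firstDT_agg <- firstDT[, lapply(.SD, sum), by = rowBrCrCode,
                       .SDcols = c("rowPlanted", "rowPsaDeath", "rowSurvival")]
print(firstDT_agg)
```

Groups come back in order of first appearance, so GL_287 is listed first here.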
Answer 1 (score: 0)
If all you want to do is aggregate, this should do it:
test2 <- aggregate(cbind(rowPlanted, rowPsaDeath, rowSurvival) ~ rowBrCrCode,
                   data = firstDT, FUN = sum)
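For comparison, a self-contained base R sketch of the formula interface, using the same five sample rows from the question. The names df and agg are hypothetical, chosen only for this example. As far as I can tell, the commented-out call in the question fails because it groups by the two count columns instead of the breeding code, and it passes the character column rowBrCrCode to sum.

```r
# Sample rows typed in from the question's "Goal" table
# (df is a hypothetical name for this example)
df <- data.frame(
  rowBrCrCode = c("GL_287", "GL_287", "GL_287", "aCK_227", "aCK_406"),
  rowPlanted  = c(63, 13, 67, 17, 20),
  rowPsaDeath = c(24, 7, 26, 5, 1),
  rowSurvival = 0  # placeholder, recycled to every row
)

# Left of ~ : the columns to sum; right of ~ : the grouping column
agg <- aggregate(cbind(rowPlanted, rowPsaDeath, rowSurvival) ~ rowBrCrCode,
                 data = df, FUN = sum)
print(agg)
```

Unlike the data.table version, aggregate returns the groups in sorted order, which can depend on the locale, so the row order may differ from the input.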