I want to sum the rows of a data frame that share the same name prefix. This is my input:
input <- 'name sample1 sample2 sample3
pr_001A 300 200 300
pr_001B 233 211 333
pr_002A 244 214 214
pr_002B 234 233 213'
input <- read.table(text=input, header=T)
and I want to get this output:
output <- 'name sample1 sample2 sample3
pr_001 533 411 633
pr_002 478 447 427'
output <- read.table(text=output, header=T)
So, for sample1 in pr_001, the result is 300 + 233 = 533, and all samples and names must follow the same logic. Any ideas on how to solve this? Thanks!
Answer 0: (score: 4)
Here is an option with data.table:
library(data.table)
setDT(input)[, lapply(.SD, sum), by = .(name = sub(".$", "", name))]
# name sample1 sample2 sample3
# 1: pr_001 533 411 633
# 2: pr_002 478 447 427
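The approaches in this answer all build the same grouping key: `sub(".$", "", name)` drops the last character of each name, so `pr_001A` and `pr_001B` both collapse to `pr_001`. A minimal sketch of that step on its own, assuming every name ends in exactly one suffix character (the variables `nm`, `key`, and `key_strict` are only illustrative):

```r
# Names as they appear in the question's input
nm <- c("pr_001A", "pr_001B", "pr_002A", "pr_002B")

# ".$" matches the single final character of each string;
# replacing it with "" leaves the shared prefix used for grouping
key <- sub(".$", "", nm)
print(key)

# A stricter variant that only strips a trailing capital letter,
# leaving names without such a suffix unchanged
key_strict <- sub("[A-Z]$", "", nm)
print(key_strict)
```

If some names might not carry a suffix at all, the stricter pattern is probably the safer choice, since `".$"` would remove a character from those names too.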
Or use the aggregate() formula method (@rawr already showed the data.frame method in the comments):
aggregate(. ~ cbind(name = sub(".$", "", input$name)), input[-1], sum)
# name sample1 sample2 sample3
# 1 pr_001 533 411 633
# 2 pr_002 478 447 427
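The comment with the data.frame method is not reproduced here. As a rough sketch of what such a call could look like (not necessarily the exact code from that comment), the grouping key can be passed through the `by` argument of `aggregate()`; the variables `key` and `res` are illustrative names:

```r
# Rebuild the question's input so the example is self-contained
input <- read.table(text = "name sample1 sample2 sample3
pr_001A 300 200 300
pr_001B 233 211 333
pr_002A 244 214 214
pr_002B 234 233 213", header = TRUE)

# Grouping key: the name without its last character
key <- sub(".$", "", input$name)

# data.frame method: sum every sample column within each key;
# naming the list element makes the key column come out as "name"
res <- aggregate(input[-1], by = list(name = key), FUN = sum)
print(res)
```

This needs only base R, which may be convenient when adding a package dependency is not desirable.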
Another option, using dplyr:
library(dplyr)
summarise_each(group_by(input, name = sub(".$", "", name)), funs(sum))
# Source: local data frame [2 x 4]
#
# name sample1 sample2 sample3
# (chr) (int) (int) (int)
# 1 pr_001 533 411 633
# 2 pr_002 478 447 427
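In recent dplyr releases, `summarise_each()` and `funs()` are deprecated. A sketch of the same grouping written with `across()`, assuming dplyr 1.0.0 or later is installed (the result name `res` is illustrative):

```r
library(dplyr)

# Rebuild the question's input so the example is self-contained
input <- read.table(text = "name sample1 sample2 sample3
pr_001A 300 200 300
pr_001B 233 211 333
pr_002A 244 214 214
pr_002B 234 233 213", header = TRUE)

res <- input %>%
  # Overwrite name with the grouping key (name minus its last character)
  group_by(name = sub(".$", "", name)) %>%
  # across(everything(), sum) sums every non-grouping column per group
  summarise(across(everything(), sum), .groups = "drop")
print(res)
```

The logic is unchanged from the call above; only the summarising verb differs.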
Answer 1: (score: 0)
An option with plyr:
library(plyr)
input <- 'name sample1 sample2 sample3
pr_001A 300 200 300
pr_001B 233 211 333
pr_002A 244 214 214
pr_002B 234 233 213'
input <- read.table(text=input, header=T)
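# Grouping key: strip the trailing capital letter from each name
input$aux_name <- gsub("[A-Z]$", "", input$name)
# Name each summary so the result keeps the original column names
result <- ddply(input, .(aux_name), summarize,
                sample1 = sum(sample1), sample2 = sum(sample2), sample3 = sum(sample3))
View(result)
  aux_name sample1 sample2 sample3
1   pr_001     533     411     633
2   pr_002     478     447     427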