Sum of rows two by two

Asked: 2015-10-02 19:28:15

Tags: r sum

I would like to sum the rows of a data frame two at a time. This is my input:

input <- 'name sample1 sample2 sample3
          pr_001A  300  200    300
          pr_001B  233  211   333
          pr_002A  244  214  214  
          pr_002B  234  233  213'
input <- read.table(text = input, header = TRUE)

To get this output:

output <- 'name sample1 sample2 sample3
              pr_001  533  411    633
              pr_002  478  447  427'  
output <- read.table(text = output, header = TRUE)

So for sample1 in pr_001 the result is 300 + 233 = 533, and all samples and names must follow the same logic. Any ideas on how to solve this? Thanks!

2 answers:

Answer 0 (score: 4)

Here is an option using data.table:

library(data.table)
setDT(input)[, lapply(.SD, sum), by = .(name = sub(".$", "", name))]
#      name sample1 sample2 sample3
# 1: pr_001     533     411     633
# 2: pr_002     478     447     427
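
The grouping key in these answers comes from `sub(".$", "", name)`, which drops the last character of each name so that the A and B rows share one label. A minimal base R sketch of just that step, using the names from the question (the variable `key` is only an illustrative name):

```r
# Names taken from the question's input
name <- c("pr_001A", "pr_001B", "pr_002A", "pr_002B")

# "." matches any single character and "$" anchors the match at the end
# of the string, so sub() removes exactly the last character of each name.
key <- sub(".$", "", name)
print(key)
# [1] "pr_001" "pr_001" "pr_002" "pr_002"
```

Note that this pattern removes the final character whatever it is, so it assumes every name ends in a single-letter suffix.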

Or use the aggregate() formula method (@rawr has already shown the data.frame method in the comments):

aggregate(. ~ cbind(name = sub(".$", "", input$name)), input[-1], sum)
#     name sample1 sample2 sample3
# 1 pr_001     533     411     633
# 2 pr_002     478     447     427
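
The comment mentioned above is not reproduced here. As a rough sketch of what a call to the data.frame method of `aggregate()` could look like (not necessarily the commenter's exact code), the grouping key can be passed through `by` instead of a formula:

```r
# Rebuild the question's input so the example is self-contained
input <- read.table(text = "name sample1 sample2 sample3
pr_001A 300 200 300
pr_001B 233 211 333
pr_002A 244 214 214
pr_002B 234 233 213", header = TRUE)

# Sum every sample column within each group; the group label is the
# name with its last character removed.
result <- aggregate(input[-1],
                    by = list(name = sub(".$", "", input$name)),
                    FUN = sum)
print(result)
#     name sample1 sample2 sample3
# 1 pr_001     533     411     633
# 2 pr_002     478     447     427
```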

Another option uses dplyr:

library(dplyr)
summarise_each(group_by(input, name = sub(".$", "", name)), funs(sum))
# Source: local data frame [2 x 4]
#
#     name sample1 sample2 sample3
#    (chr)   (int)   (int)   (int)
# 1 pr_001     533     411     633
# 2 pr_002     478     447     427
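
`summarise_each()` and `funs()` have since been deprecated in dplyr. Assuming dplyr 1.0.0 or later is installed, the same grouping and summing can probably be written with `across()`, roughly like this:

```r
library(dplyr)

# Rebuild the question's input so the example is self-contained
input <- read.table(text = "name sample1 sample2 sample3
pr_001A 300 200 300
pr_001B 233 211 333
pr_002A 244 214 214
pr_002B 234 233 213", header = TRUE)

result <- input %>%
  # Overwrite name with the trimmed key and group on it
  group_by(name = sub(".$", "", name)) %>%
  # across(everything(), sum) sums every non-grouping column per group
  summarise(across(everything(), sum))

print(result)
```

The result is a tibble with one row per trimmed name; its printed header depends on the dplyr version, so no output is shown here.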

Answer 1 (score: 0)

An option using plyr:

library(plyr)
input <- 'name sample1 sample2 sample3
          pr_001A  300  200    300
          pr_001B  233  211   333
          pr_002A  244  214  214  
          pr_002B  234  233  213'
input <- read.table(text = input, header = TRUE)

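# Strip the trailing uppercase letter so that paired rows share a key
input$aux_name <- gsub("[A-Z]$", "", input$name)
# The sums are unnamed, so ddply labels the result columns ..1, ..2, ..3
result <- ddply(input, .(aux_name), summarize, sum(sample1), sum(sample2), sum(sample3))
View(result)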

  aux_name ..1 ..2 ..3
1   pr_001 533 411 633
2   pr_002 478 447 427
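
The `..1 ..2 ..3` headers appear because the expressions passed to `summarize` are unnamed. A small variation, sketched here on the assumption that only plyr is attached (so `summarize` resolves to plyr's version), names each sum so that the result keeps the original column names:

```r
library(plyr)

# Rebuild the question's input so the example is self-contained
input <- read.table(text = "name sample1 sample2 sample3
pr_001A 300 200 300
pr_001B 233 211 333
pr_002A 244 214 214
pr_002B 234 233 213", header = TRUE)

# Strip the trailing uppercase letter to build the grouping key
input$aux_name <- sub("[A-Z]$", "", input$name)

# Naming each summary expression gives the output readable column names
result <- ddply(input, .(aux_name), summarize,
                sample1 = sum(sample1),
                sample2 = sum(sample2),
                sample3 = sum(sample3))
print(result)
#   aux_name sample1 sample2 sample3
# 1   pr_001     533     411     633
# 2   pr_002     478     447     427
```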