Sum of a column based on subgroups of another column in an R data.table

Date: 2017-07-22 17:24:24

Tags: r data.table

Consider the sample data.table

library(data.table)
dt = data.table(A = c(1:5), B = c("a","b","c","a","b"))

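I want to sum column "A" grouped by column "B", but using c("a","b") and "c" as the subgroups, so that "a" and "b" are treated as one group. That is, the output should look like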

c("a","b")= 12
"c"   =     3 

1 answer:

Answer 0: (score: 2)

You can convert B to a factor and then change its levels so that "a" and "b" both map to the same level:

#convert B to factor
dt[, B := factor(B)]
#change levels to ab and c
levels(dt$B) <- c('ab', 'ab', 'c')
#group and sum
dt[, sum(A), by = B]
#    B V1
#1: ab 12
#2:  c  3
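
Note that the factor approach above changes column B of dt in place. If you would rather leave dt untouched, one possible sketch is to compute the group label directly inside `by`. The column names `grp` and `total` are just illustrative choices, and `fifelse()` assumes data.table 1.12.4 or later (base `ifelse()` should work the same way here):

```r
library(data.table)

# Same sample data as in the question; B stays a character column
dt <- data.table(A = 1:5, B = c("a", "b", "c", "a", "b"))

# Build the grouping label on the fly inside `by`, so dt$B is never modified.
# Rows with B equal to "a" or "b" get the label "ab"; other rows keep their B value.
res <- dt[, .(total = sum(A)),
          by = .(grp = fifelse(B %in% c("a", "b"), "ab", B))]

print(res)
```

Since the label is computed only for the grouping step, dt still has its original B values afterwards.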

Or, per @akrun's comment, you can do this:

dt[, .(B = paste(unique(B), collapse=""), A = sum(A)), 
   .(grp = B %in% c('a', 'b'))][, grp := NULL][]

Or, per @Frank's comment, you can build a small mapping table and join on it:

mDT = unique(dt[, "B"])[, g := B][B %in% c("a","b"), g := "ab"]
dt[mDT, on=.(B)][, sum(A), by=g]
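
The mapping-table idea generalizes to more than two subgroups. As a rough sketch of the same idea without a join, a named character vector can serve as the lookup. The name `grp_map` and the output column names `g` and `total` are hypothetical and not part of the original answer:

```r
library(data.table)

dt <- data.table(A = 1:5, B = c("a", "b", "c", "a", "b"))

# Hypothetical lookup: names are the original B values, values are the subgroup labels
grp_map <- c(a = "ab", b = "ab", c = "c")

# Index the lookup by B to get one label per row, then group on that label.
# unname() drops the vector names so the grouping column is a plain character vector.
res <- dt[, .(total = sum(A)), by = .(g = unname(grp_map[B]))]

print(res)
```

This assumes B is a character column and that every value of B appears in `grp_map`; a value missing from the lookup would end up in an NA group.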