Consider the sample data.table:
dt = data.table(A = c(1:5), B = c("a","b","c","a","b"))
I would like to sum column A by column B, but using the subgroups c("a","b") and "c". That is, the output should look like:
c("a","b") = 12
"c" = 3
Answer 0 (score: 2)
You can convert B to a factor and then change its levels to merge "a" and "b" into a single group:
#convert B to factor
dt[, B := factor(B)]
#change levels to ab and c
levels(dt$B) <- c('ab', 'ab', 'c')
#group and sum
dt[, sum(A), by = B]
#     B V1
# 1: ab 12
# 2:  c  3
Or, per @akrun's comment, you can group on a logical flag and paste the group members together:
dt[, .(B = paste(unique(B), collapse=""), A = sum(A)),
.(grp = B %in% c('a', 'b'))][, grp := NULL][]
Or, per @Frank's comment, build a small mapping table and join on it:
mDT = unique(dt[, "B"])[, g := B][B %in% c("a","b"), g := "ab"]
dt[mDT, on=.(B)][, sum(A), by=g]
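A further variant, not taken from the answers above, is to compute the group label directly inside `by`, which leaves `dt` unmodified. This is only a minimal sketch: the column names `grp` and `total` and the label "ab" are illustrative choices, not anything the question requires.

```r
library(data.table)

# Sample data from the question
dt <- data.table(A = 1:5, B = c("a", "b", "c", "a", "b"))

# Label rows with B in c("a", "b") as "ab" (illustrative name),
# keep every other value of B as its own group, then sum A per group.
res <- dt[, .(total = sum(A)),
          by = .(grp = ifelse(B %in% c("a", "b"), "ab", B))]

print(res)
# grp "ab" has total 12, grp "c" has total 3
```

Because the grouping expression is evaluated on the fly, B stays a character column, so the other approaches can still be run on the same `dt` afterwards.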