Get differences between data.table rows based on another column

Time: 2017-06-25 14:08:39

Tags: r data.table aggregate

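I have a data frame, data, shown below. How can I use data.table to efficiently compute, for each index, the difference between the class c1 and class c2 values of columns v1, v3 and v4?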

data <- data.frame(index = c("A", "B", "C", "D", "C", "A", "D", "B"),
                   class = c(rep("c1", 4), rep("c2", 4)),
                   v1 = c(21,85,74,96,55,77,21,34),
                   v3 = c(77,41,91,85,22,74,36,41),
                   v4 = c(41,58,24,36,84,24,74,11))

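library(data.table)  # provides setDT() and the := operator used below
setDT(data)          # convert the data.frame to a data.table by reference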
data[, index := as.factor(index)]
data[, class := as.factor(class)]

   index class v1 v3 v4
1:     A    c1 21 77 41
2:     B    c1 85 41 58
3:     C    c1 74 91 24
4:     D    c1 96 85 36
5:     C    c2 55 22 84
6:     A    c2 77 74 24
7:     D    c2 21 36 74
8:     B    c2 34 41 11

The desired output is

out <- data.frame(index = data[1:4]$index,
           v1 = data[1:4]$v1 - data[5:8]$v1,
           v3 = data[1:4]$v3 - data[5:8]$v3,
           v4= data[1:4]$v4 - data[5:8]$v4)
out
  index  v1 v3  v4
1     A -56  3  17
2     B  51  0  47
3     C  19 69 -60
4     D  75 49 -38

1 answer:

Answer 0 (score: 4)

You can apply the diff() function to the numeric columns within each index group:

data[,lapply(.SD,diff),by=index,.SDcols = v1:v4]
#   index  v1  v3  v4
#1:     A  56  -3 -17
#2:     B -51   0 -47
#3:     C -19 -69  60
#4:     D -75 -49  38

To flip the signs (so that they match the expected output), simply use -.SD instead of .SD.
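As a minimal sketch of that sign flip, the snippet below rebuilds the sample data from the question and negates .SD before calling diff(). The names cols, flipped and robust are mine, not from the original answer, and the value columns are listed explicitly rather than as a range. The second variant is an assumption-laden alternative: it sorts by class first so the result does not depend on the c1 row appearing before the c2 row, and it assumes exactly one c1 row and one c2 row per index.

```r
library(data.table)

# Same sample data as in the question
data <- data.table(index = c("A", "B", "C", "D", "C", "A", "D", "B"),
                   class = c(rep("c1", 4), rep("c2", 4)),
                   v1 = c(21, 85, 74, 96, 55, 77, 21, 34),
                   v3 = c(77, 41, 91, 85, 22, 74, 36, 41),
                   v4 = c(41, 58, 24, 36, 84, 24, 74, 11))

cols <- c("v1", "v3", "v4")  # value columns, listed explicitly (illustrative name)

# Negating .SD before diff() turns (c2 - c1) into (c1 - c2) within each index
flipped <- data[, lapply(-.SD, diff), by = index, .SDcols = cols]

# Variant: sort by class first, so c1 precedes c2 regardless of input row order.
# Assumes exactly one c1 row and one c2 row per index.
robust <- data[order(class), lapply(.SD, function(x) x[1] - x[2]),
               by = index, .SDcols = cols]
```

On this data both flipped and robust should reproduce the desired output shown in the question; the first is the shorter form, the second may be safer if the rows could arrive in a different order.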