I have a data frame, data, shown below. How can I use data.table to efficiently compute the difference between the values of class c1 and class c2 in columns v1, v3 and v4, by index?
library(data.table)

data <- data.frame(index = c("A", "B", "C", "D", "C", "A", "D", "B"),
                   class = c(rep("c1", 4), rep("c2", 4)),
                   v1 = c(21, 85, 74, 96, 55, 77, 21, 34),
                   v3 = c(77, 41, 91, 85, 22, 74, 36, 41),
                   v4 = c(41, 58, 24, 36, 84, 24, 74, 11))
setDT(data)  # convert to a data.table by reference
data[, index := as.factor(index)]
data[, class := as.factor(class)]
   index class v1 v3 v4
1:     A    c1 21 77 41
2:     B    c1 85 41 58
3:     C    c1 74 91 24
4:     D    c1 96 85 36
5:     C    c2 55 22 84
6:     A    c2 77 74 24
7:     D    c2 21 36 74
8:     B    c2 34 41 11
The desired output is:
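# Rows 5:8 (class c2) are not in the same index order as rows 1:4,
# so sort them by index before subtracting.
out <- data.frame(index = data[1:4]$index,
                  v1 = data[1:4]$v1 - data[5:8][order(index)]$v1,
                  v3 = data[1:4]$v3 - data[5:8][order(index)]$v3,
                  v4 = data[1:4]$v4 - data[5:8][order(index)]$v4)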
out
  index  v1 v3  v4
1     A -56  3  17
2     B  51  0  47
3     C  19 69 -60
4     D  75 49 -38
Answer 0: (score: 4)
You can apply the diff() function to the numeric columns for each index:
data[, lapply(.SD, diff), by = index, .SDcols = v1:v4]
#    index  v1  v3  v4
# 1:     A  56  -3 -17
# 2:     B -51   0 -47
# 3:     C -19 -69  60
# 4:     D -75 -49  38
To flip the sign (as in the expected output), simply use -.SD instead of .SD.
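As a minimal, self-contained sketch of the sign flip: the version below negates the result of diff() for each column, which should give the same result as using -.SD for numeric columns. It also sorts by index and class first, since diff() only yields c2 - c1 when the c1 row comes before the c2 row within each index. The explicit setorder() call and the name res are additions for illustration, not part of the original answer.

```r
library(data.table)

data <- data.table(
  index = c("A", "B", "C", "D", "C", "A", "D", "B"),
  class = c(rep("c1", 4), rep("c2", 4)),
  v1 = c(21, 85, 74, 96, 55, 77, 21, 34),
  v3 = c(77, 41, 91, 85, 22, 74, 36, 41),
  v4 = c(41, 58, 24, 36, 84, 24, 74, 11)
)

# Sort so that c1 precedes c2 within each index (assumption: exactly one
# row per index/class pair); diff() then returns c2 - c1 for each group.
setorder(data, index, class)

# Negate the per-group difference to obtain c1 - c2.
res <- data[, lapply(.SD, function(x) -diff(x)), by = index,
            .SDcols = c("v1", "v3", "v4")]
print(res)
```

With the sample data, res should match the desired output shown in the question. If an index has more than two rows, diff() returns more than one value per group, so this sketch only fits the one-row-per-class layout used here.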