My data looks like the following:
# View date value1 Value2
# a 2012-10-01 21.01 2.00
# b 2012-10-01 22.04 3.03
# c 2012-10-01 22.65 7.61
# a 2012-11-01 23.11 8.46
# b 2012-11-01 35.21 9.00
# c 2012-11-01 35.21 9.00
structure(list(View = c("a", "b", "c", "a", "b", "c"), date = c("2012-10-01",
"2012-10-01", "2012-10-01", "2012-11-01", "2012-11-01", "2012-11-01"
), value1 = c(21.01, 22.04, 22.65, 23.11, 35.21, 35.21), Value2 = c(2,
3.03, 7.61, 8.46, 9, 9)), .Names = c("View", "date", "value1",
"Value2"), row.names = c(NA, -6L), class = "data.frame")
I want to create a new View "D" that is the subtraction of "a" from "c" for any given date, ending up with a dataset that looks like this:
# View date value1 Value2
# a 2012-10-01 21.01 2.00
# b 2012-10-01 22.04 3.03
# c 2012-10-01 22.65 7.61
# D 2012-10-01 1.64 5.61
# a 2012-11-01 23.11 8.46
# b 2012-11-01 35.21 9.00
# c 2012-11-01 35.21 9.00
# D 2012-11-01 12.10 0.54
I know some R, but I have no idea how to approach this. Any suggestions would be much appreciated.
Answer 0 (score: 1)
After grouping your data.table by date, you can rbind the newly calculated row onto .SD (the sub-data.table for each unique date).
library(data.table)
setDT(df)  # convert the data.frame to a data.table by reference
df[, rbind(.SD,
           .(View = "D", value1 = value1[View == "c"] - value1[View == "a"],
             Value2 = Value2[View == "c"] - Value2[View == "a"])), date]
# date View value1 Value2
#1: 2012-10-01 a 21.01 2.00
#2: 2012-10-01 b 22.04 3.03
#3: 2012-10-01 c 22.65 7.61
#4: 2012-10-01 D 1.64 5.61
#5: 2012-11-01 a 23.11 8.46
#6: 2012-11-01 b 35.21 9.00
#7: 2012-11-01 c 35.21 9.00
#8: 2012-11-01 D 12.10 0.54
To avoid hard-coding the value column names, while still assuming you want to operate on the date and View columns:
df[, rbind(.SD, {
  # drop the View column so that the remaining columns can be subtracted
  dt <- .SD[, !"View", with = FALSE]
  # subtract row a from row c, then add a View column labelled "D"
  (dt[View == "c"] - dt[View == "a"])[, View := "D"][]
}), date]
# date View value1 Value2
#1: 2012-10-01 a 21.01 2.00
#2: 2012-10-01 b 22.04 3.03
#3: 2012-10-01 c 22.65 7.61
#4: 2012-10-01 D 1.64 5.61
#5: 2012-11-01 a 23.11 8.46
#6: 2012-11-01 b 35.21 9.00
#7: 2012-11-01 c 35.21 9.00
#8: 2012-11-01 D 12.10 0.54
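For comparison, here is a self-contained sketch of a slightly different route: compute the "D" rows separately with lapply over .SD, then append them to the original table. It is a variation, not the code above. The names num_cols, d_rows and res are hypothetical, and it assumes each date has exactly one "a" row and one "c" row.

```r
library(data.table)

# Rebuild the example data from the question as a data.table
dt <- data.table(
  View   = c("a", "b", "c", "a", "b", "c"),
  date   = c("2012-10-01", "2012-10-01", "2012-10-01",
             "2012-11-01", "2012-11-01", "2012-11-01"),
  value1 = c(21.01, 22.04, 22.65, 23.11, 35.21, 35.21),
  Value2 = c(2, 3.03, 7.61, 8.46, 9, 9)
)

# Columns to difference: everything except the two key columns (assumption)
num_cols <- setdiff(names(dt), c("View", "date"))

# One row per date holding c - a for every value column
d_rows <- dt[, lapply(.SD, function(x) x[View == "c"] - x[View == "a"]),
             by = date, .SDcols = num_cols]
d_rows[, View := "D"]

# Append the new rows, matching columns by name, then sort by date;
# the sort keeps ties in their existing order, so "D" lands last within each date
res <- rbind(dt, d_rows, use.names = TRUE)
res <- res[order(date)]
print(res)
```

Keeping the difference step separate from the append makes it easy to inspect d_rows on its own before it is merged back in.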