I have a data frame:
ID_1 <- c("A","B","C","D","A","A","B","E","D","F","H")
ID_2 <- c("G","D","I","A","J","B","K","D","A","H","A")
Value <- c(10,9,15,27,3,28,4,3,11,19,12)
DF <- data.frame(ID_1, ID_2, Value, stringsAsFactors = FALSE)
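I want a new column that contains, for a given ID ('ID_1'), the last (i.e. previous) value ('Value') found where that same ID appears in the other column ('ID_2'). In other words: the expected solution should find the most recent earlier entry in 'ID_2' that matches the given ID ('ID_1') and extract the corresponding value ('Value') into a new column.
The final dataset should look like this (one new column added to the existing three; for illustration):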
NEW <- c(NA,NA,NA,9,27,27,28,NA,3,NA,19)
DF_NEW <- data.frame(ID_1, ID_2, Value, NEW, stringsAsFactors = FALSE)
Thanks in advance for your help!
Answer 0 (score: 1)
One option is to create a row-number column on DF and then use a data.table rolling join:
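library(data.table)
# Convert DF to a data.table by reference and add a row-number column
setDT(DF)[, rn := seq_len(.N)]
# Self-join: match each row's ID_1 (i) against ID_2 (x), and roll back to
# the latest x row whose rn is at or before the current row's rn
DF[DF,
   .(ID_1 = i.ID_1, ID_2 = i.ID_2, Value = i.Value, New = x.Value),
   on = .(ID_2 = ID_1, rn = rn),
   roll = Inf]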
#     ID_1 ID_2 Value New
#  1:    A    G    10  NA
#  2:    B    D     9  NA
#  3:    C    I    15  NA
#  4:    D    A    27   9
#  5:    A    J     3  27
#  6:    A    B    28  27
#  7:    B    K     4  28
#  8:    E    D     3  NA
#  9:    D    A    11   3
# 10:    F    H    19  NA
# 11:    H    A    12  19
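To sanity-check the join result, the same lookup can be written as a plain base R loop. This is only a minimal sketch, not part of the answer above: the helper name `last_prev_value` is made up for illustration, and it assumes "previous" means strictly earlier rows. The rolling join also accepts an exact match on `rn`, so the two would differ only if a row had the same ID in both 'ID_1' and 'ID_2', which does not happen in the example data.

```r
ID_1  <- c("A","B","C","D","A","A","B","E","D","F","H")
ID_2  <- c("G","D","I","A","J","B","K","D","A","H","A")
Value <- c(10,9,15,27,3,28,4,3,11,19,12)

# Hypothetical helper: for each row i, look only at rows 1..(i-1) whose
# ID_2 equals ID_1[i] and return the Value of the latest such row,
# or NA if there is no earlier match.
last_prev_value <- function(id_1, id_2, value) {
  vapply(seq_along(id_1), function(i) {
    hits <- which(id_2[seq_len(i - 1)] == id_1[i])
    if (length(hits) == 0) NA_real_ else value[max(hits)]
  }, numeric(1))
}

NEW <- last_prev_value(ID_1, ID_2, Value)
print(NEW)
```

The loop is quadratic in the number of rows, so it is probably only useful for small data or for checking the data.table result; the rolling join should scale much better.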