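I want to insert a new row after each group (grp), filling some of the new row's columns with values from the next row and others with values from the previous row.
I am trying to use: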
x <- rbind(setDT(df), df[, .SD[.N], grp][, color := shift(color, 1L, type = "lag")][, Lat := shift(Lat, 1L, type = "lead")])[order(id)]
on the following data:
a <- c(1,2,3,4,5,6,7,8,9,10)
b <- c(10,20,30,40,50,60,70,80,90,100)
c <- c("a","a","b","b","b","a","a","b","c","c")
d <- c(11,23,67,89,90,100,101,123,200,290)
df <- data.frame(color=a, Lat=b, grp=c, id=d)
I am probably using shift() incorrectly; it does not seem to work as I expect.
Thanks.
The expected result is:
color Lat grp id
1 1 10 a 11
2 2 20 a 23
new row with color from previous row, Lat from next row, grp from previous row and id from next row
3 3 30 b 67
4 4 40 b 89
5 5 50 b 90
new row as before
6 6 60 a 100
7 7 70 a 101
new row as before
8 8 80 b 123
new row as before
9 9 90 c 200
10 10 100 c 290
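Answer 0 (score: 0)
We create two lead columns, i.e. the next-row values of 'Lat' and 'id', and then group by the rleid of 'grp'. Here, rleid checks whether adjacent elements of 'grp' are the same; whenever an element differs from the one before it, rleid assigns it a new id.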
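As a small illustration of what rleid does, here is a sketch that applies it to the 'grp' values from the question. The variable name run_id is only used for this example and does not appear in the answer's code.

```r
library(data.table)

# The 'grp' column from the question
grp <- c("a", "a", "b", "b", "b", "a", "a", "b", "c", "c")

# rleid() starts at 1 and increments each time the value differs
# from the previous element, so every run gets its own id
run_id <- rleid(grp)
print(run_id)
# [1] 1 1 2 2 2 3 3 4 5 5
```

Note that the two separate runs of "a" get different ids (1 and 3), which is why grouping by rleid(grp) rather than by grp itself gives one new row per run.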
library(data.table)
setDT(df)[, c("LatN", "idN") := shift(.SD, type = 'lead'), .SDcols = c('Lat', 'id')]
Grouping by the rleid of 'grp', get the last observation of the selected columns in each run:
tmp <- df[, .(color = color[.N], Lat = LatN[.N], id = idN[.N], grp = grp[.N]),
.(grp1 = rleid(grp))][, grp1 := NULL]
rbind the result with the original dataset columns, order by 'color', and remove any NA rows:
na.omit(rbind(df[,.(color, Lat, id, grp)], tmp)[order(color)])
# color Lat id grp
# 1: 1 10 11 a
# 2: 2 20 23 a
# 3: 2 30 67 a
# 4: 3 30 67 b
# 5: 4 40 89 b
# 6: 5 50 90 b
# 7: 5 60 100 b
# 8: 6 60 100 a
# 9: 7 70 101 a
#10: 7 80 123 a
#11: 8 80 123 b
#12: 8 90 200 b
#13: 9 90 200 c
#14: 10 100 290 c
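The steps above can also be wrapped into one function. The following is only a sketch under some assumptions: the helper name insert_run_boundary_rows and the temporary columns row_pos and run are made up for this example, and it orders by the original row position instead of 'color', which avoids relying on 'color' already being sorted. It also works on a copy, so the input is not modified by reference.

```r
library(data.table)

# Hypothetical helper (not part of the original answer): adds one row after
# each run of identical 'grp' values, except after the final run.
insert_run_boundary_rows <- function(dat) {
  dt <- copy(dat)                      # deep copy so the caller's object is untouched
  setDT(dt)                            # convert the local copy to a data.table
  dt[, row_pos := as.numeric(.I)]      # remember the original row order
  dt[, c("LatN", "idN") := shift(.SD, type = "lead"), .SDcols = c("Lat", "id")]
  # Last row of each run: color and grp from that row, Lat and id from the next row
  new_rows <- dt[, .(color = color[.N], Lat = LatN[.N], id = idN[.N],
                     grp = grp[.N], row_pos = row_pos[.N] + 0.5),
                 by = .(run = rleid(grp))][, run := NULL]
  out <- rbind(dt[, .(color, Lat, id, grp, row_pos)], new_rows)[order(row_pos)]
  # The row built for the final run has no next row, so its Lat and id are NA
  na.omit(out)[, row_pos := NULL][]
}

# Data from the question
df <- data.frame(color = 1:10,
                 Lat   = seq(10, 100, by = 10),
                 grp   = c("a", "a", "b", "b", "b", "a", "a", "b", "c", "c"),
                 id    = c(11, 23, 67, 89, 90, 100, 101, 123, 200, 290))

res <- insert_run_boundary_rows(df)
print(res)
```

On the question's data this should give the same 14 rows as the output shown above. One caveat: na.omit would also drop any original rows that already contain NA values.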