I have the following data frame:
id<-c(1,1,1,1,1,3,3,3,3)
period<-c("calib","calib","calib","valid","valid","calib","calib","calib","valid")
date<-c("11-11-07","11-11-07","23-11-07","12-12-08","17-12-08","11-11-07","23-11-07","23-11-07","16-01-08")
time<-c(12,13,14,11,23,15,12,18,14)
df<-data.frame(id,period,time,date)
df$date2<-as.Date(as.character(df$date), format = "%d-%m-%y")
id period time     date      date2
 1  calib   12 11-11-07 2007-11-11
 1  calib   13 11-11-07 2007-11-11
 1  calib   14 23-11-07 2007-11-23
 1  valid   11 12-12-08 2008-12-12
 1  valid   23 17-12-08 2008-12-17
 3  calib   15 11-11-07 2007-11-11
 3  calib   12 23-11-07 2007-11-23
 3  calib   18 23-11-07 2007-11-23
 3  valid   14 16-01-08 2008-01-16
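I need to extract the date of the last transaction for each id during the calib period and put it in a new column. If two transactions were made on the same day (same date), the last one should be chosen by transaction time.
The final table I am looking for is as follows: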
id period time     date      date2       last
 1  calib   12 11-11-07 2007-11-11         NA
 1  calib   13 11-11-07 2007-11-11         NA
 1  calib   14 23-11-07 2007-11-23 2007-11-23
 1  valid   11 12-12-08 2008-12-12         NA
 1  valid   23 17-12-08 2008-12-17         NA
 3  calib   15 11-11-07 2007-11-11         NA
 3  calib   12 23-11-07 2007-11-23         NA
 3  calib   18 23-11-07 2007-11-23 2007-11-23
 3  valid   14 16-01-08 2008-01-16         NA
Can anyone help me?!
Answer 0: (score: 1)
I was able to solve the problem with rle:
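# Process each id separately; rows are assumed to be sorted by date and time within each id.
L1 <- lapply(split(df, df[, "id"]), function(dat) {
  dat[, "last"] <- as.Date(NA)
  # Run-length encode period; the cumulative lengths give the last row of each run.
  x <- rle(as.character(dat[, "period"]))
  z <- cumsum(x[["lengths"]])
  # Copy date2 into last at the final row of each calib run.
  idx <- z[x[["values"]] == "calib"]
  dat$last[idx] <- dat[idx, "date2"]
  dat
})
# Stack the per-id pieces back into one data frame.
data.frame(do.call(rbind, L1), row.names = NULL)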
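The rle approach picks the last row of each calib run, so it only gives the latest transaction when the rows are already in date and time order within each id, as they are in the example. If that ordering is not guaranteed, a possible variant is to sort first and then flag the last calib row per id with duplicated(). This is only a sketch in base R: it rebuilds the example data from the question, the helper names calib_rows and last_rows are made up for illustration, and it assumes time can be compared numerically.

```r
# Rebuild the example data from the question.
df <- data.frame(
  id     = c(1, 1, 1, 1, 1, 3, 3, 3, 3),
  period = c("calib", "calib", "calib", "valid", "valid",
             "calib", "calib", "calib", "valid"),
  time   = c(12, 13, 14, 11, 23, 15, 12, 18, 14),
  date   = c("11-11-07", "11-11-07", "23-11-07", "12-12-08", "17-12-08",
             "11-11-07", "23-11-07", "23-11-07", "16-01-08"),
  stringsAsFactors = FALSE
)
df$date2 <- as.Date(df$date, format = "%d-%m-%y")

# Sort by id, then date, then time, so that "last" is well defined.
df <- df[order(df$id, df$date2, df$time), ]

# Positions of the calib rows (hypothetical helper name).
calib_rows <- which(df$period == "calib")

# Keep only the final calib row of each id: with fromLast = TRUE,
# duplicated() marks every occurrence except the last one.
last_rows <- calib_rows[!duplicated(df$id[calib_rows], fromLast = TRUE)]

# Fill the new column: NA everywhere except the selected rows.
df$last <- as.Date(NA)
df$last[last_rows] <- df$date2[last_rows]

print(data.frame(df, row.names = NULL))
```

Unlike the rle version, this marks at most one row per id even if an id has several separate calib runs, so pick whichever behaviour matches what you need.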