我有一个数据框df:
df <- data.frame(names=c("john","mary","tom"),dates=c(as.Date("2010-06-01"),as.Date("2010-07-09"),as.Date("2010-06-01")),tours_missed=c(2,12,6))
names dates tours_missed
john 2010-06-01 2
mary 2010-07-09 12
tom 2010-06-01 6
我希望能够添加一行,其中包含错过的人的日期。这个人每天有2个旅行团。每个人每4天工作一次。
结果应该是(尽管顺序并不重要):
names dates tours_missed
john 2010-06-01 2
mary 2010-07-09 12
mary 2010-07-13 12
mary 2010-07-17 12
mary 2010-07-21 12
mary 2010-07-25 12
mary 2010-07-29 12
tom 2010-06-01 6
tom 2010-06-05 6
tom 2010-06-09 6
我已尝试查看这些主题,但无法产生上述结果:Add rows to a data frame based on date in previous row,In R: Add rows with data of previous row to data frame,add new row to dataframe,enter link description here。谢谢你的帮助!
答案 0 :(得分:3)
library(data.table)
dt = as.data.table(df) # or convert in-place using setDT
# all of the relevant dates
dates.all = dt[, seq(dates, length = tours_missed/2, by = "4 days"), by = names]
# set the key and merge filling in the blanks with previous observation
setkey(dt, names, dates)
dt[dates.all, roll = T]
# names dates tours_missed
# 1: john 2010-06-01 2
# 2: mary 2010-07-09 12
# 3: mary 2010-07-13 12
# 4: mary 2010-07-17 12
# 5: mary 2010-07-21 12
# 6: mary 2010-07-25 12
# 7: mary 2010-07-29 12
# 8: tom 2010-06-01 6
# 9: tom 2010-06-05 6
#10: tom 2010-06-09 6
或者如果合并是不必要的(OP中不太清楚),只需构建答案:
dt[, list(dates = seq(dates, length = tours_missed/2, by = "4 days"), tours_missed)
, by = names]