In R: Add rows based on a date and another condition

Time: 2015-02-20 21:21:19

Tags: r dataframe

I have a data frame df:

df <- data.frame(names=c("john","mary","tom"),dates=c(as.Date("2010-06-01"),as.Date("2010-07-09"),as.Date("2010-06-01")),tours_missed=c(2,12,6))

names   dates       tours_missed
john    2010-06-01  2
mary    2010-07-09  12
tom     2010-06-01  6

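I want to add one row for each date on which a person missed tours. Each person has 2 tours on a working day, and each person works once every 4 days. So a person with 6 missed tours missed 3 working days, spaced 4 days apart and starting from the date in dates.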

The result should be (the order does not matter):

names   dates       tours_missed
john    2010-06-01  2
mary    2010-07-09  12
mary    2010-07-13  12
mary    2010-07-17  12
mary    2010-07-21  12
mary    2010-07-25  12
mary    2010-07-29  12
tom     2010-06-01  6
tom     2010-06-05  6
tom     2010-06-09  6

I have looked at these threads but could not produce the result above: "Add rows to a data frame based on date in previous row", "In R: Add rows with data of previous row to data frame", and "add new row to dataframe". Thanks for your help!

1 answer:

Answer 0 (score: 3)

library(data.table)
dt = as.data.table(df) # or convert in-place using setDT

# all of the relevant dates: tours_missed/2 working days per person, 4 days apart
dates.all = dt[, seq(dates, length.out = tours_missed/2, by = "4 days"), by = names]

# set the key and merge filling in the blanks with previous observation
setkey(dt, names, dates)
dt[dates.all, roll = TRUE]
#    names      dates tours_missed
# 1:  john 2010-06-01            2
# 2:  mary 2010-07-09           12
# 3:  mary 2010-07-13           12
# 4:  mary 2010-07-17           12
# 5:  mary 2010-07-21           12
# 6:  mary 2010-07-25           12
# 7:  mary 2010-07-29           12
# 8:   tom 2010-06-01            6
# 9:   tom 2010-06-05            6
#10:   tom 2010-06-09            6
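
To see what roll = TRUE contributes in the join above, here is a minimal sketch on a toy table. The tables x and i and their contents are made up for illustration and are not part of the original question. Without roll, lookup dates that have no exact match in x come back as NA; with roll = TRUE, the value from the most recent earlier date within the same name is carried forward.

```r
library(data.table)

# Toy lookup table (hypothetical data): one observation per name
x <- data.table(names = c("mary", "tom"),
                dates = as.Date(c("2010-07-09", "2010-06-01")),
                tours_missed = c(12, 6))
setkey(x, names, dates)

# Dates to look up; only mary's first date exists in x
i <- data.table(names = c("mary", "mary", "tom"),
                dates = as.Date(c("2010-07-09", "2010-07-13", "2010-06-05")))

# Exact join: rows without a matching date get NA in tours_missed
exact <- x[i, on = c("names", "dates")]

# Rolling join: the previous observation for the same name is carried forward
rolled <- x[i, on = c("names", "dates"), roll = TRUE]
```

Only the last join column (dates here) is rolled; names must still match exactly, which is why mary's value never leaks into tom's rows.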

Or, if the merge is unnecessary (the OP is not entirely clear on this), just construct the answer directly:

dt[, list(dates = seq(dates, length.out = tours_missed/2, by = "4 days"), tours_missed)
   , by = names]
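
For readers who prefer not to depend on data.table, the same construction can be sketched in base R. This is one possible approach rather than the answer's method, and it assumes tours_missed is always a positive even number so that tours_missed / 2 is a whole number of working days. The names n_days and expanded are introduced here for illustration.

```r
df <- data.frame(names = c("john", "mary", "tom"),
                 dates = as.Date(c("2010-06-01", "2010-07-09", "2010-06-01")),
                 tours_missed = c(2, 12, 6))

# Missed working days per person (assumption: 2 tours per working day,
# and tours_missed is even)
n_days <- df$tours_missed / 2

# Repeat each original row once per missed working day
expanded <- df[rep(seq_len(nrow(df)), times = n_days), ]

# Within each person, shift the date by 0, 4, 8, ... days
expanded$dates <- expanded$dates + 4 * (sequence(n_days) - 1)
rownames(expanded) <- NULL
```

Here rep() repeats the row indices and sequence() produces the running counter 1, 2, ... within each person, so no explicit loop or grouping step is needed.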