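I have a data table containing locations where events recur at different frequencies. For each location, the date of the last occurrence and the frequency of occurrence (in days) are given.
Example: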
dt
# Location Last_Occurrence Frequency
# 1: Home 7-19-2018 30
# 2: School 6-6-2018 60
# 3: Moon 1-5-1993 90
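What I would like to do is add a new column that contains all future event dates for each location through the end of 2018.
So I would like a table that looks like this: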
dt
# Location Last_Occurrence Frequency Next_Dates
# 1: Home 7-19-2018 30 7-19-2018
# 2: Home 7-19-2018 30 8-18-2018
# 3: Home 7-19-2018 30 9-17-2018
# 4: Home 7-19-2018 30 10-17-2018
# 5: Home 7-19-2018 30 11-16-2018
# 6: Home 7-19-2018 30 12-16-2018
# 7: School 6-6-2018 60 6-6-2018
# 8: School 6-6-2018 60 8-5-2018
# 9: School 6-6-2018 60 10-4-2018
etc.
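How should I go about this? I suspect the lapply function would be useful, since I am doing this for each location...
I have already figured out how to use a "while" loop to generate a vector of future dates: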
library(lubridate) # provides year()
Last_Sample_Date <- Sys.Date() # For testing
increase <- 5 # For testing
NextDate <- Last_Sample_Date + increase
# Create vector of next sampling dates - updated with each iteration of the while loop
NextDates <- c(Last_Sample_Date, NextDate)
while (year(NextDate) == 2018) {
  # Step forward by one fixed interval per iteration
  NextDate <- NextDate + increase
  # Add to vector of next sampling dates
  NextDates <- append(NextDates, NextDate)
}
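(I realize this actually produces a vector whose last date falls in 2019, but I can live with that.)
Can I use the while loop somehow, or is there another way to approach this?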
Answer 0 (score: 1)
My version with data.table:
library(data.table)
# create example dataset
dt <- data.table(
location = c("home", "school", "moon"),
orig_date = as.Date(c("2018-07-19", "2018-06-06", "2015-01-05")),
freq_days = c(30, 60, 90)
)
# figure out how many new rows are needed
dt[ , rows_needed := length(seq(from=orig_date, to=as.Date("2018-12-31"), by=paste(freq_days,"days"))), by=location]
# expand the data.table to include the new rows
dt <- dt[rep(1:nrow(dt), times=rows_needed)]
# add the dates of occurrence
dt[ , date_of_occurrence := seq(from=orig_date[1], to=as.Date("2018-12-31"), by=paste(freq_days[1],"days")), by=location]
# shift dates of occurrence to get next date
dt[ , next_date := shift(date_of_occurrence, type="lead"), by=location]
# drop rows where next occurrence is after 2018 (should you want this)
dt <- dt[!is.na(next_date)]
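The expansion steps above could probably also be collapsed into a single grouped call. The following is only a sketch under a few assumptions: it reuses the example data from this answer, `end_date` and `expanded` are names introduced here for illustration, and it stops at the `date_of_occurrence` column (it skips the shift/drop step, so the last occurrence in 2018 is kept).

```r
library(data.table)

# Same example data as in the answer above
dt <- data.table(
  location  = c("home", "school", "moon"),
  orig_date = as.Date(c("2018-07-19", "2018-06-06", "2015-01-05")),
  freq_days = c(30, 60, 90)
)

# Assumed cutoff, taken from the question (end of 2018)
end_date <- as.Date("2018-12-31")

# seq() runs once per group; the grouping columns are repeated
# automatically for every generated date
expanded <- dt[
  ,
  .(date_of_occurrence = seq(orig_date, end_date, by = paste(freq_days, "days"))),
  by = .(location, orig_date, freq_days)
]
```

Grouping by all three original columns keeps them in the result and guarantees that `orig_date` and `freq_days` are single values inside each group, which is what `seq()` needs.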
Answer 1 (score: 0)
IIUC, you can use complete from tidyr:
library(dplyr)
library(tidyr)
# assumes Last_Occurrence has already been converted to Date
df %>% group_by(Location, Frequency, Last_Occurrence) %>%
  mutate(next_date = Last_Occurrence) %>%
  complete(next_date = seq(from = next_date, to = as.Date("2018-12-31"), by = Frequency))
# A tibble: 10 x 4
# Groups: Location, Frequency, Last_Occurrence [2]
Location Frequency Last_Occurrence next_date
<chr> <int> <date> <date>
1 Home 30 2018-07-19 2018-07-19
2 Home 30 2018-07-19 2018-08-18
3 Home 30 2018-07-19 2018-09-17
4 Home 30 2018-07-19 2018-10-17
5 Home 30 2018-07-19 2018-11-16
6 Home 30 2018-07-19 2018-12-16
7 School 60 2018-06-06 2018-06-06
8 School 60 2018-06-06 2018-08-05
9 School 60 2018-06-06 2018-10-04
10 School 60 2018-06-06 2018-12-03
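Since the question mentions lapply, here is a rough base R sketch of the same idea, for comparison. It assumes the column names from the question, that `Last_Occurrence` is already a Date, and that `Frequency` is in days; `df`, `end_date`, `pieces`, and `result` are names chosen here for illustration.

```r
# Example data in the shape used in the question (dates already parsed)
df <- data.frame(
  Location        = c("Home", "School"),
  Last_Occurrence = as.Date(c("2018-07-19", "2018-06-06")),
  Frequency       = c(30L, 60L),
  stringsAsFactors = FALSE
)

# Assumed cutoff, taken from the question (end of 2018)
end_date <- as.Date("2018-12-31")

# Build one small data frame per location, then stack them
pieces <- lapply(seq_len(nrow(df)), function(i) {
  # All occurrence dates for row i, starting at the last occurrence
  dates <- seq(df$Last_Occurrence[i], end_date, by = paste(df$Frequency[i], "days"))
  # Repeat row i once per date and attach the dates as a new column
  data.frame(df[rep(i, length(dates)), ], Next_Dates = dates, row.names = NULL)
})
result <- do.call(rbind, pieces)
```

This avoids the while loop entirely: `seq()` stops at the cutoff, so no date in 2019 is generated.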