我有一个df,包含某个过程的开始日期和结束日期。复制数据:
ID1 <- c("AUT","AUT","AUT","BEL","BEL","BEL")
start_date <- c("2008-12-02", "2013-12-16", "2016-05-17", "2007-06-10", "2007-12-21", "2008-03-20")
start_date <- as.Date(start_date, "%Y-%m-%d")
end_date <- c("2013-12-15", "2016-05-16", "2017-11-30", "2007-12-20", "2008-03-19", "2008-12-29")
end_date <- as.Date(end_date, "%Y-%m-%d")
ID2 <- 1:6
df <- data.frame(ID1, as.character(start_date), as.character(end_date), ID2)
看起来像:
ID1 start_date end_date ID2
1 AUT 2008-12-02 2013-12-15 1
2 AUT 2013-12-16 2016-05-16 2
3 AUT 2016-05-17 2017-11-30 3
4 BEL 2007-06-10 2007-12-20 4
5 BEL 2007-12-21 2008-03-19 5
6 BEL 2008-03-20 2008-12-29 6
我想在给定时间段内为此df添加新的列和行:列应为year
,其值取决于是否在一年的第一天(20XX-01) -01)在进程边界内部或外部。我想看到的是:
ID1 start_date end_date ID2 year
1 AUT 2008-12-02 2013-12-15 1 2009
2 AUT 2008-12-02 2013-12-15 1 2010
2 AUT 2008-12-02 2013-12-15 1 2011
4 AUT 2008-12-02 2013-12-15 1 2012
5 AUT 2008-12-02 2013-12-15 1 2013
6 AUT 2013-12-16 2016-05-16 2 2014
7 AUT 2013-12-16 2016-05-16 2 2015
8 AUT 2013-12-16 2016-05-16 2 2016
9 AUT 2016-05-17 2017-11-30 3 2017
10 BEL 2007-06-10 2007-12-20 4 NA
11 BEL 2007-12-21 2008-03-19 5 2008
12 BEL 2008-03-20 2008-12-29 6 NA
编辑:较小的建议更改以提高清晰度
答案 0 :(得分:1)
所以我稍微修改了初始代码 :
library(lubridate)
library(data.table)
ID1 <- c("AUT","AUT","AUT","BEL","BEL","BEL")
start_date <- ymd("2008-12-02", "2013-12-16", "2016-05-17", "2007-06-10", "2007-12-21", "2008-03-20")
end_date <- ymd("2013-12-15", "2016-05-16", "2017-11-30", "2007-12-20", "2008-03-19", "2008-12-29")
ID2 <- 1:6
df <- data.table(ID1, start_date, end_date, ID2)
df[,yearDiff:=year(end_date)-year(start_date)]
df<-df[,cbind(.SD,year=(year(start_date)+1):year(end_date)),by="ID2"]
df[,dateInterval:=interval(df$start_date,df$end_date)]
df[,IsYearWithinDate:=(ymd(paste0(year,"01","01",sep="-"))%within% dateInterval)]
df[,.(ID1,start_date,end_date,ID2,year,IsYearWithinDate)]
导致(通过df[,.(ID1,start_date,end_date,ID2,year
):
ID1 start_date end_date ID2 year
1: AUT 2008-12-02 2013-12-15 1 2009
2: AUT 2008-12-02 2013-12-15 1 2010
3: AUT 2008-12-02 2013-12-15 1 2011
4: AUT 2008-12-02 2013-12-15 1 2012
5: AUT 2008-12-02 2013-12-15 1 2013
6: AUT 2013-12-16 2016-05-16 2 2014
7: AUT 2013-12-16 2016-05-16 2 2015
8: AUT 2013-12-16 2016-05-16 2 2016
9: AUT 2016-05-17 2017-11-30 3 2017
10: BEL 2007-06-10 2007-12-20 4 2008
11: BEL 2007-06-10 2007-12-20 4 2007
12: BEL 2007-12-21 2008-03-19 5 2008
13: BEL 2008-03-20 2008-12-29 6 2009
14: BEL 2008-03-20 2008-12-29 6 2008