Find the difference between dates based on a condition

Time: 2017-06-03 06:00:01

Tags: r

Consider this dataset:

mydf <- data.frame(churn_indicator = c(0,0,1,0,1), resign_date = c(NA,NA,"2011-01-01",NA,"2012-02-01"), join_date = c("2001-01-01","2001-03-01","2002-04-02", "2003-09-01","2005-05-10"))

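The task is to compute a vector 'length' that equals resign_date - join_date for rows where churn_indicator = 1, and Sys.Date() - join_date for rows where churn_indicator = 0.

I have figured out how to do this with a for loop, but I would like to use something more efficient (perhaps the apply family). Also, is it possible to do this with dplyr's mutate function?

1 answer:

Answer 0: (score: 0)

A possible solution: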

# convert column from factor/characters to Date (if not already done)
mydf$resign_date <- as.Date(mydf$resign_date)
mydf$join_date <- as.Date(mydf$join_date)

# compute the date differences
days_churn1 <- as.numeric(difftime(mydf$resign_date,mydf$join_date,units='days'))
days_churn0 <- as.numeric(difftime(Sys.Date(),mydf$join_date,units='days'))

# set to zero the values where churn indicator is not what we want
days_churn1[mydf$churn_indicator==0]<-0
days_churn0[mydf$churn_indicator==1]<-0

# sum the two vectors
mydf$length <- days_churn1+days_churn0

> mydf
  churn_indicator resign_date  join_date length
1               0        <NA> 2001-01-01   5997
2               0        <NA> 2001-03-01   5938
3               1  2011-01-01 2002-04-02   3196
4               0        <NA> 2003-09-01   5024
5               1  2012-02-01 2005-05-10   2458
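
The values for churn_indicator = 0 in the output above depend on the day the code is run, because they are measured against Sys.Date(). As a minimal sketch for checking the logic reproducibly, the version below swaps Sys.Date() for a fixed reference date. The names ref_date and end_date are illustrative and not part of the original answer; 2017-06-03 is chosen because it matches the output shown above.

```r
mydf <- data.frame(
  churn_indicator = c(0, 0, 1, 0, 1),
  resign_date = c(NA, NA, "2011-01-01", NA, "2012-02-01"),
  join_date = c("2001-01-01", "2001-03-01", "2002-04-02",
                "2003-09-01", "2005-05-10")
)
mydf$resign_date <- as.Date(mydf$resign_date)
mydf$join_date <- as.Date(mydf$join_date)

# assumption: a fixed reference date used in place of Sys.Date()
ref_date <- as.Date("2017-06-03")

# start from resign_date, then fill in the reference date for non-churned rows
end_date <- mydf$resign_date
end_date[mydf$churn_indicator == 0] <- ref_date

mydf$length <- as.numeric(difftime(end_date, mydf$join_date, units = "days"))
mydf$length
# 5997 5938 3196 5024 2458
```

Building one end-date vector first means only a single difftime call is needed, and the NA values in resign_date never reach the subtraction.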

Alternatively, you can combine some of the operations using ifelse:

# convert column from factor/characters to Date  (if not already done)
mydf$resign_date <- as.Date(mydf$resign_date)
mydf$join_date <- as.Date(mydf$join_date)

# ifelse() drops the difftime class, so the result is a plain numeric vector of days
mydf$length <- as.numeric(
  ifelse(mydf$churn_indicator == 1,
         difftime(mydf$resign_date, mydf$join_date, units = 'days'),
         difftime(Sys.Date(), mydf$join_date, units = 'days'))
)
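
The question also asks whether this can be done with dplyr's mutate. One possible sketch, assuming the dplyr package is installed, is shown below. The helper column end_date is an illustrative name, not something from the original post. It uses dplyr's if_else rather than base ifelse because if_else preserves the Date class of its inputs.

```r
library(dplyr)

mydf <- data.frame(
  churn_indicator = c(0, 0, 1, 0, 1),
  resign_date = c(NA, NA, "2011-01-01", NA, "2012-02-01"),
  join_date = c("2001-01-01", "2001-03-01", "2002-04-02",
                "2003-09-01", "2005-05-10")
)

mydf <- mydf %>%
  mutate(
    resign_date = as.Date(resign_date),
    join_date   = as.Date(join_date),
    # hypothetical helper column: resignation date if churned, otherwise today
    end_date    = if_else(churn_indicator == 1, resign_date, Sys.Date()),
    length      = as.numeric(difftime(end_date, join_date, units = "days"))
  )

mydf
```

If the helper column is not wanted in the result, it can be dropped afterwards with select(-end_date).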