Return the number of business days since a deadline as an integer, and add business days to a date, using dplyr

Asked: 2017-11-08 01:50:05

Tags: r dplyr

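I would like a function that returns the number of business days since a specific date, and one that adds business days to a date. Both need to cope with NAs in the data. My current solution is sloppy, and there should be a more elegant way.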

library(dplyr)
library(timeDate)
library(RQuantLib)
library(lubridate)


item <- c("a", "b")
date1 = as.Date(c("2017-11-30", "2017-11-01"))
date2 = as.Date(c("2017-12-01", "2017-11-16"))
d <- data.frame(item, date1, date2, stringsAsFactors=F)

line3 <- c("c", "2017-12-03", NA)
line4 <- c("d", NA, "2017-12-03")
d <- rbind(d, line3, line4)

This function works, but it is very slow when run over many items, and it is not very readable.

bizDeadline <- function(x, nBizDys = 10) {
  output <- Reduce(rbind, Map((function(x, howMuch = 15) {
    x <- as.Date(x, origin = "1960-01-01")
    days <- x + 1:(howMuch * 2)
    Deadline <- days[isBizday(as.timeDate(days))][howMuch]
    data.frame(DateIn = x, Deadline, DayOfWeek = weekdays(Deadline), TimeDiff = difftime(Deadline,
                                                                                         x))
  }), x, howMuch = nBizDys))
  output$Deadline
}

Ideally, it should exclude holidays as well as weekends.

d %>% mutate(deadline = bizDeadline(date1, 10))

d$DaysOverdue <- NA

This works in a loop, but not in a vectorized mutate.

i = 1
for(i in 1:nrow(d)){
  d$DaysOverdue[i] = businessDaysBetween("UnitedStates", d$date1[i], today())
}

This function from RQuantLib does not appear to be vectorized:

d %>% mutate(od = businessDaysBetween("UnitedStates", date1, today()))

Is there a better solution?

1 Answer:

Answer 0 (score: 1)

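I suggest using R's Vectorize function, which makes it easy to vectorize a function over chosen arguments. P.S. The resulting function cannot handle NA, which is why the example below uses only the first two rows.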

businessDaysBetween_vec <- Vectorize(businessDaysBetween,vectorize.args = c('from', 'to'))

d[1:2,] %>% mutate(od = businessDaysBetween_vec("UnitedStates", date1, today()))
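Because the Vectorize wrapper still fails on NA, one option is to put the NA check inside the scalar function before vectorizing it. The following is only a minimal sketch and does not reproduce RQuantLib's behaviour: `weekdays_between` is a hypothetical base-R stand-in for businessDaysBetween that skips Saturdays and Sundays but knows nothing about holidays, so it runs without RQuantLib installed.

```r
# Hypothetical scalar helper (an assumption for illustration, not RQuantLib):
# counts Mon-Fri days after `from`, up to and including `to`.
# Holidays are NOT excluded.
weekdays_between <- function(from, to) {
  # Guard first, so missing dates yield NA instead of an error
  if (is.na(from) || is.na(to)) return(NA_integer_)
  # Defensive: restores the Date class if it was dropped on the way in
  from <- as.Date(from, origin = "1970-01-01")
  to   <- as.Date(to, origin = "1970-01-01")
  if (to <= from) return(0L)
  days <- seq(from + 1, to, by = "day")
  wday <- as.POSIXlt(days)$wday        # 0 = Sunday, 6 = Saturday
  sum(wday != 0 & wday != 6)
}

# Same Vectorize pattern as above, applied to the NA-safe helper
weekdays_between_vec <- Vectorize(weekdays_between,
                                  vectorize.args = c("from", "to"))

from <- as.Date(c("2017-11-01", NA))
to   <- as.Date("2017-11-08")
weekdays_between_vec(from, to)         # 5 for the first element, NA for the second
```

With a guard like this, a call such as mutate(od = weekdays_between_vec(date1, today())) should run on all four rows rather than only the first two, assuming date1 is a Date column. The same guard could presumably be wrapped around businessDaysBetween itself to keep the holiday calendar.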


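# Check and compare the speed of the two solutions
foo_vec <- function() {
  # The vectorized wrapper needs its arguments; calling it with none would error
  businessDaysBetween_vec("UnitedStates", d$date1[1:2], today())
}
foo_loop <- function() {
  for(i in 1:2){
    # Assigns into a local copy of d, which is fine for timing purposes
    d$DaysOverdue[i] = businessDaysBetween("UnitedStates", d$date1[i], today())
  }
}
library(microbenchmark)
library(ggplot2)
res <- microbenchmark(foo_vec(), foo_loop(), times = 1e5)
autoplot(res)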

result
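
For the other half of the question, adding business days to a date, one vectorized approach is to build the business-day calendar once and index into it, rather than testing days row by row. This is a minimal base-R sketch under a loud assumption: only weekends are skipped, holidays are not. `add_weekdays` is a hypothetical name, and `n` is assumed to be a positive integer.

```r
# Hypothetical helper (an assumption for illustration): returns the date
# n weekdays (Mon-Fri) after each element of `start`. Holidays are NOT
# excluded; NA inputs give NA outputs. Assumes n >= 1.
add_weekdays <- function(start, n = 10) {
  start <- as.Date(start)
  out <- rep(as.Date(NA), length(start))
  ok <- !is.na(start)
  if (!any(ok)) return(out)

  # One shared calendar covering every start date plus a generous margin
  cal <- seq(min(start[ok]) + 1, max(start[ok]) + 2 * n + 7, by = "day")
  wday <- as.POSIXlt(cal)$wday          # 0 = Sunday, 6 = Saturday
  cal <- cal[wday != 0 & wday != 6]

  # Number of calendar weekdays on or before each start date
  idx <- findInterval(as.numeric(start[ok]), as.numeric(cal))
  out[ok] <- cal[idx + n]
  out
}

add_weekdays(as.Date(c("2017-11-01", NA, "2017-12-03")), n = 10)
```

Holidays could be handled by also dropping them from `cal` before indexing, for example with a holiday list taken from the timeDate package that the question already loads.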