Calculate business days between two dates

Time: 2016-08-31 23:03:24

Tags: r

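I'm new to R and learning a lot, and I love it.

My sales dataset contains product, bpid, and date. I want to calculate the difference in business days (excluding only Saturdays and Sundays) between consecutive dates that share the same bpid.

If the product or bpid changes (or a new bpid/product is introduced), the calculation should be skipped for that row.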

df <- data.frame(product=c('milk','milk','milk','milk','eggs','eggs','eggs','eggs'), 
                 bpid=c(400,400,500,500,400,400,500,500),
                 date=c("2016-08-03","2016-08-10","2016-08-04","2016-08-10","2016-08-10","2016-08-16","2016-08-11","2016-08-15"));

df$date <- as.Date(df$date, format = "%Y-%m-%d");

The result I want is shown below. Please help.

 product bpid       date compute-result
    milk  400 2016-08-03              0
    milk  400 2016-08-10              5
    milk  500 2016-08-04              0
    milk  500 2016-08-10              5
    eggs  400 2016-08-10              0
    eggs  400 2016-08-16              4
    eggs  500 2016-08-11              0
    eggs  500 2016-08-15              2

Code with my real data (I get zeros in the result column):

df <- data.frame(product=c('Keyt','Keyt','Keyt','Keyt','Keyt','Keyt'), 
                 bpid=c(30057,30057,30057,30058,30058,30058),
                 date=c("2014-11-21","2015-05-05","2015-05-11","2014-10-16","2014-11-03","2016-03-15"));

df$date <- as.Date(df$date, format = "%Y-%m-%d");

cal <-  Calendar(weekdays=c("saturday", "sunday"))
df$`compute-result` <- 0
idx <- seq(1, nrow(df),2)
df$`compute-result`[idx+1] <- bizdays(df$date[idx], df$date[idx+1], cal)
df

1 answer:

Answer 0 (score: 6)

For example:

# install.packages("bizdays")
library(bizdays)
cal <-  create.calendar(name = "mycal", weekdays=c("saturday", "sunday"))
df$`compute-result` <- 0
idx <- seq(1, nrow(df),2)
df$`compute-result`[idx+1] <- bizdays(df$date[idx], df$date[idx+1], cal)
df
#   product bpid       date compute-result
# 1    milk  400 2016-08-03              0
# 2    milk  400 2016-08-10              5
# 3    milk  500 2016-08-04              0
# 4    milk  500 2016-08-10              4
# 5    eggs  400 2016-08-10              0
# 6    eggs  400 2016-08-16              4
# 7    eggs  500 2016-08-11              0
# 8    eggs  500 2016-08-15              2
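
The positional pairing above (rows 1-2, 3-4, and so on) only lines up with the groups when every product/bpid combination has exactly two consecutive rows. A small sketch using the bpid values from the question's real data shows where it breaks; `pairs` is just an illustrative name, not something from the original code:

```r
# bpid column of the question's real data: three rows per bpid.
bpid <- c(30057, 30057, 30057, 30058, 30058, 30058)

# Same positional pairing as in the code above.
idx <- seq(1, length(bpid), 2)

# For each pair, check whether both rows belong to the same bpid.
pairs <- data.frame(from_row  = idx,
                    to_row    = idx + 1,
                    same_bpid = bpid[idx] == bpid[idx + 1])
pairs
#   from_row to_row same_bpid
# 1        1      2      TRUE
# 2        3      4     FALSE
# 3        5      6      TRUE
```

The middle pair compares the last row of bpid 30057 with the first row of bpid 30058, which is why grouping is needed for data like this.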

If you want to group by product and bpid, you could try

# install.packages("bizdays")
library(bizdays)
cal <-  create.calendar(name="mycal", weekdays=c("saturday", "sunday"))
with(df, ave(as.integer(date), product, bpid, FUN=function(x) { 
  x <- as.Date(x, origin="1970-01-01") 
  c(0, bizdays(head(x, -1), tail(x, -1), cal)) 
})) -> df$result
df
#   product  bpid       date result
# 1    Keyt 30057 2014-11-21      0
# 2    Keyt 30057 2015-05-05    117
# 3    Keyt 30057 2015-05-11      4
# 4    Keyt 30058 2014-10-16      0
# 5    Keyt 30058 2014-11-03     12
# 6    Keyt 30058 2016-03-15    356

Note that the function converts date to integer and then back to Date, because otherwise ave would throw an error:

Error in as.Date.numeric(value) : 'origin' must be supplied
and I don't know how to supply that origin argument here.
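
One possible way around the origin problem is to hand ave() the row numbers instead of the dates and look the dates up inside the function, so the Date class is never dropped. The sketch below is base R only and is not the answer's method: `count_weekdays` is a hypothetical helper standing in for bizdays(), and it assumes the count covers the Monday-to-Friday days after the start date up to and including the end date.

```r
# Sample data from the question.
df <- data.frame(
  product = c("milk", "milk", "milk", "milk", "eggs", "eggs", "eggs", "eggs"),
  bpid    = c(400, 400, 500, 500, 400, 400, 500, 500),
  date    = as.Date(c("2016-08-03", "2016-08-10", "2016-08-04", "2016-08-10",
                      "2016-08-10", "2016-08-16", "2016-08-11", "2016-08-15"))
)

# Hypothetical helper (not part of bizdays): for each pair of dates, count the
# Monday-Friday days after `from`, up to and including `to`.
count_weekdays <- function(from, to) {
  vapply(seq_along(from), function(i) {
    if (to[i] <= from[i]) return(0L)
    days <- seq(from[i] + 1, to[i], by = "day")
    wday <- as.POSIXlt(days)$wday  # 0 = Sunday, 6 = Saturday
    sum(wday != 0 & wday != 6)
  }, integer(1))
}

# ave() only sees integer row numbers; the dates are indexed inside FUN,
# so no integer/Date round trip is needed.
df$result <- ave(seq_len(nrow(df)), df$product, df$bpid, FUN = function(i) {
  d <- df$date[i]
  c(0L, count_weekdays(head(d, -1), tail(d, -1)))
})
df$result
# [1] 0 5 0 4 0 4 0 2
```

For this sample the counts match the bizdays output shown earlier in the answer. With the bizdays package loaded, the call to `count_weekdays` could presumably be swapped for `bizdays(head(d, -1), tail(d, -1), cal)`.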