Matching holidays across different states

Time: 2018-11-09 16:34:18

Tags: r

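I currently have many invoices with dates, but they come from different states. I want to set up a holiday indicator that checks whether each invoice date is a holiday in the corresponding state.

For example, given tables A and B below, if a date in table A is a holiday for that state according to table B, the holidayIndicator column should be set to 1; otherwise it should be set to 0. The result should be the complete table A with a value of 0 or 1 in the holidayIndicator column.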

Table A:
date    state   holidayIndicator
1/1/2018    E   0
2/1/2018    F   0
3/1/2018    G   0
4/1/2018    E   0
5/1/2018    F   0
6/1/2018    G   0

Table B:
State   Holiday
E   1/1/2018
E   3/1/2018
E   3/28/2018
F   5/26/2018
F   6/2/2018
F   7/1/2018
G   9/1/2018
G   6/1/2018
G   5/29/2018

The result should look like this:

date    state   holidayIndicator
1/1/2018    E   1
2/1/2018    F   0
3/1/2018    G   0
4/1/2018    E   0
5/1/2018    F   0
6/1/2018    G   1

3 answers:

Answer 0 (score: 1)

Assuming the two tables are df1 and df2:

df1$holidayIndicator[interaction(df1[, c('date', 'state')]) %in% interaction(df2[, c('Holiday', 'State')])] <- 1
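
To try this one-liner on the sample data from the question, here is a minimal sketch. It assumes both tables are plain data frames with the dates stored as character strings in the same format; key1 and key2 are illustrative names that are not part of the original answer.

```r
# Sample data from the question; df1 and df2 are the names assumed in this answer.
df1 <- data.frame(
  date  = c("1/1/2018", "2/1/2018", "3/1/2018", "4/1/2018", "5/1/2018", "6/1/2018"),
  state = c("E", "F", "G", "E", "F", "G"),
  holidayIndicator = 0,
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  State   = c("E", "E", "E", "F", "F", "F", "G", "G", "G"),
  Holiday = c("1/1/2018", "3/1/2018", "3/28/2018", "5/26/2018", "6/2/2018",
              "7/1/2018", "9/1/2018", "6/1/2018", "5/29/2018"),
  stringsAsFactors = FALSE
)

# Build one combined key per row (date + state).
# The column order must be the same on both sides: date first, then state.
key1 <- interaction(df1[, c("date", "state")])      # hypothetical helper name
key2 <- interaction(df2[, c("Holiday", "State")])   # hypothetical helper name

# Rows of df1 whose key also occurs in df2 are holidays.
df1$holidayIndicator[key1 %in% key2] <- 1

print(df1)
```

Because the comparison is done on the combined key, a date that is a holiday in one state (for example 3/1/2018 in E) does not flag the same date in another state.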

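Answer 1 (score: 0)

I'm not very familiar with R, but I wonder whether you could use a package/library such as "bizdays" to determine whether a given date is a holiday.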

https://cran.r-project.org/web/packages/bizdays/bizdays.pdf

Answer 2 (score: 0)

A solution based on plain data.frame (without using the packages dplyr or data.table) looks like this:

a <- read.table(text = "date    state   holidayIndicator
1/1/2018    E   0
2/1/2018    F   0
3/1/2018    G   0
4/1/2018    E   0
5/1/2018    F   0
6/1/2018    G   0", header = TRUE, stringsAsFactors = FALSE)

b <- read.table(text = "State   Holiday
E   1/1/2018
E   3/1/2018
E   3/28/2018
F   5/26/2018
F   6/2/2018
F   7/1/2018
G   9/1/2018
G   6/1/2018
G   5/29/2018", header = TRUE, stringsAsFactors = FALSE)

b$isHoliday <- 1   # add a helper column (auto-fills all rows with the same value)

# "left outer join" similar to SQL (all.x = TRUE keeps every row of a) to "enrich" it with the helper column value
res <- merge(a, b, by.x = c("date", "state"), by.y = c("Holiday", "State"), all.x = TRUE)

res$holidayIndicator[res$isHoliday == 1] <- 1  # mark the holidays using the enriched helper column

# Optionally: Remove the helper column from the result
res$isHoliday <- NULL
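
One caveat with merge(): by default it sorts the result on the key columns, so res may not keep the row order of a. If that order matters, a possible alternative (a sketch, not part of the original answer) is to build a combined key per row and test membership with %in%. The sketch below also parses the dates with as.Date, assuming the month/day/year format implied by values such as 3/28/2018, so that differently padded strings like 01/01/2018 and 1/1/2018 would compare equal. The names key_a and key_b are illustrative.

```r
# Same sample data as above, rebuilt here so the sketch is self-contained.
a <- data.frame(
  date  = c("1/1/2018", "2/1/2018", "3/1/2018", "4/1/2018", "5/1/2018", "6/1/2018"),
  state = c("E", "F", "G", "E", "F", "G"),
  stringsAsFactors = FALSE
)
b <- data.frame(
  State   = c("E", "E", "E", "F", "F", "F", "G", "G", "G"),
  Holiday = c("1/1/2018", "3/1/2018", "3/28/2018", "5/26/2018", "6/2/2018",
              "7/1/2018", "9/1/2018", "6/1/2018", "5/29/2018"),
  stringsAsFactors = FALSE
)

# Assumption: dates are month/day/year. Parsing normalizes them to ISO format
# before they are pasted together with the state into a single key.
key_a <- paste(as.Date(a$date,    format = "%m/%d/%Y"), a$state)
key_b <- paste(as.Date(b$Holiday, format = "%m/%d/%Y"), b$State)

# TRUE/FALSE membership converted to 1/0; rows of a stay in their original order.
a$holidayIndicator <- as.integer(key_a %in% key_b)

print(a)
```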