I want to join two datasets and add a column named pro_sales that is 1 when Date falls between StartDate and EndDate, and 0 otherwise. Here are my datasets:
data1=data.frame(Date=as.Date("2015-06-28"),Storecode=34)
data2=data.frame(Promo=c("Promo1","Promo2","Promo3","Promo4")
,StartDate=c("2015-02-10","2015-03-15","2015-05-24","2015-06-21")
,EndDate=c("2015-02-17","2015-03-22","2015-06-01","2015-06-28"))
data1$Date <- as.Date(data1$Date)
data2$StartDate <- as.Date(data2$StartDate)
data2$EndDate <- as.Date(data2$EndDate)
The dataset I want looks like this:
Date Storecode Promo StartDate EndDate pro_sales
2015-06-28 34 Promo1 2015-02-10 2015-02-17 0
2015-06-28 34 Promo2 2015-03-15 2015-03-22 0
2015-06-28 34 Promo3 2015-05-24 2015-06-01 0
2015-06-28 34 Promo4 2015-06-21 2015-06-28 1
Below is my code, but it sets pro_sales to 1 for every row.
data.frame( data1 %>%
mutate(dummy=TRUE) %>%
left_join(data2 %>% mutate(dummy=TRUE)) %>%
mutate(pro_sales = case_when(
Date>=StartDate | Date<=EndDate ~ 1,
FALSE ~ 0)))
Can you help me? Thanks.
Answer 0 (score: 0)
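How about this? Your condition used `|`, which is true for every row here because Date is on or after every StartDate; `&` requires both bounds to hold. The fallback branch also needs to be `TRUE ~ 0`, since a condition of `FALSE` never matches.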
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data1=data.frame(Date=as.Date("2015-06-28"), Storecode=34, tmp = 1)
data2=data.frame(Promo=c("Promo1","Promo2","Promo3","Promo4"),
StartDate=c("2015-02-10","2015-03-15","2015-05-24","2015-06-21"),
EndDate=c("2015-02-17","2015-03-22","2015-06-01","2015-06-28"),
tmp = 1)
data1$Date <- as.Date(data1$Date)
data2$StartDate <- as.Date(data2$StartDate)
data2$EndDate <- as.Date(data2$EndDate)
full_join(data1, data2, by = "tmp") %>%
select(-tmp) %>%
mutate(pro_sales = case_when(
Date>=StartDate & Date<=EndDate ~ 1,
TRUE ~ 0))
#> Date Storecode Promo StartDate EndDate pro_sales
#> 1 2015-06-28 34 Promo1 2015-02-10 2015-02-17 0
#> 2 2015-06-28 34 Promo2 2015-03-15 2015-03-22 0
#> 3 2015-06-28 34 Promo3 2015-05-24 2015-06-01 0
#> 4 2015-06-28 34 Promo4 2015-06-21 2015-06-28 1
Created on 2018-05-26 by the reprex package (v0.2.0).
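For comparison, the same result can probably be reached in base R without the helper `tmp` column. This is only a sketch, not part of the original answer: it assumes the same `data1` and `data2` as in the question, relies on `merge()` returning the Cartesian product when `by = NULL`, and the name `result` is just a placeholder.

```r
# Sketch only: base R alternative, assuming the question's data.
data1 <- data.frame(Date = as.Date("2015-06-28"), Storecode = 34)
data2 <- data.frame(
  Promo     = c("Promo1", "Promo2", "Promo3", "Promo4"),
  StartDate = as.Date(c("2015-02-10", "2015-03-15", "2015-05-24", "2015-06-21")),
  EndDate   = as.Date(c("2015-02-17", "2015-03-22", "2015-06-01", "2015-06-28"))
)

# by = NULL asks merge() for the Cartesian product (every data1 row
# paired with every data2 row), so no dummy join key is needed.
result <- merge(data1, data2, by = NULL)

# Both bounds must hold, hence & rather than |.
# as.integer() turns TRUE/FALSE into 1/0.
result$pro_sales <- as.integer(result$Date >= result$StartDate &
                               result$Date <= result$EndDate)
result
```

With a single row in `data1` this pairs that row with each promotion, and only the promotion whose date range contains 2015-06-28 gets pro_sales set to 1. Note that a cross join grows as rows(data1) × rows(data2), so for large tables a different approach may be preferable.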