How to merge transaction quantities from one data frame into another based on a date condition

Time: 2020-02-07 19:37:43

Tags: r dplyr data.table

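I have been searching for a few days but could not find an answer that fits my situation.

I have two data frames. The first is skus, where each row is a SKU with a date window (date1 to date2):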

skus <- data.frame("SKU_ID"=c(123,234,345,135,135,234),
                   "date1"=c(as.Date("2020-01-10"),as.Date("2020-01-05"),as.Date("2020-01-11"),as.Date("2020-01-10"),
                             as.Date("2020-01-15"),as.Date("2020-01-15")),
                   "date2"=c(as.Date("2020-01-20"),as.Date("2020-01-08"),as.Date("2020-01-15"),as.Date("2020-01-12"),
                             as.Date("2020-01-20"),as.Date("2020-01-25")))

The second is transactions:

transactions <- data.frame("SKU_ID"=c(123,123,123,234,234,234,234,234,345,345,135,135,135,135),
                           "date"=c(as.Date("2020-01-02"),as.Date("2020-01-11"),as.Date("2020-01-17"),as.Date("2020-01-05"),
                                     as.Date("2020-01-07"),as.Date("2020-01-09"),as.Date("2020-01-16"),as.Date("2020-01-17"),
                                     as.Date("2020-01-12"),as.Date("2020-01-12"),as.Date("2020-01-11"),as.Date("2020-01-12"),as.Date("2020-01-18"),
                                     as.Date("2020-01-20")),
                           "qty"=c(2,3,6,1,9,2,9,34,1,23,12,18,21,62))

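I am trying to get the following output, where qty is the sum of the transaction quantities for that SKU_ID whose date falls between date1 and date2 (inclusive):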

desiredOutput <- data.frame("SKU_ID"= c(123,234,345,135,135,234),
                   "date1"=c(as.Date("2020-01-10"),as.Date("2020-01-05"),as.Date("2020-01-11"),as.Date("2020-01-10"),
                             as.Date("2020-01-15"),as.Date("2020-01-15")),
                   "date2"=c(as.Date("2020-01-20"),as.Date("2020-01-08"),as.Date("2020-01-15"),as.Date("2020-01-12"),
                             as.Date("2020-01-20"),as.Date("2020-01-25")),
                   "qty"=c(9,10,24,30,83,43))

I have tried sqldf, dplyr, and data.table solutions, but none of them gave me the result I want.

Any suggestions?

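2 answers:

Answer 0: (score: 1)

Here is a dplyr solution. Note that the result is sorted by the grouping columns, so the row order differs from the desired output, and that a window with no matching transactions would be dropped by the inner join and filter.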

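library(dplyr)

skus %>% 
  # pair each SKU window with every transaction for the same SKU_ID
  inner_join(transactions, by=c("SKU_ID"), suffix = c(".a", ".b")) %>% 
  # keep only transactions dated within [date1, date2]
  filter(date1 <= date & date2 >= date) %>% 
  group_by(SKU_ID, date1, date2) %>% 
  summarise(qty = sum(qty)) %>% 
  ungroup() 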

# A tibble: 6 x 4
  SKU_ID date1      date2        qty
   <dbl> <date>     <date>     <dbl>
1    123 2020-01-10 2020-01-20     9
2    135 2020-01-10 2020-01-12    30
3    135 2020-01-15 2020-01-20    83
4    234 2020-01-05 2020-01-08    10
5    234 2020-01-15 2020-01-25    43
6    345 2020-01-11 2020-01-15    24

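Answer 1: (score: 1)

Or in data.table, using a non-equi join that adds qty to skus by reference and keeps the original row order: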

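library(data.table)
# For each row of skus (.SD), match transactions with the same SKU_ID and
# date1 <= date <= date2, then sum qty once per skus row (by=.EACHI).
setDT(skus)[, qty := 
    setDT(transactions)[.SD, on=.(SKU_ID, date>=date1, date<=date2), 
        by=.EACHI, sum(qty)]$V1
]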
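
For comparison, the same range-sum can be sketched in base R without any packages. This is not from either answer; it is a minimal illustration that rebuilds the question's data in compact form (as.Date() applied to a character vector) and loops over the rows of skus. The name result is chosen here for illustration only. Unlike the data.table join, a window with no matching transactions gets 0 here rather than NA, because sum() of an empty vector is 0.

```r
# Self-contained copy of the question's data (same values, compact form).
skus <- data.frame(
  SKU_ID = c(123, 234, 345, 135, 135, 234),
  date1  = as.Date(c("2020-01-10", "2020-01-05", "2020-01-11",
                     "2020-01-10", "2020-01-15", "2020-01-15")),
  date2  = as.Date(c("2020-01-20", "2020-01-08", "2020-01-15",
                     "2020-01-12", "2020-01-20", "2020-01-25"))
)
transactions <- data.frame(
  SKU_ID = c(123, 123, 123, 234, 234, 234, 234, 234,
             345, 345, 135, 135, 135, 135),
  date   = as.Date(c("2020-01-02", "2020-01-11", "2020-01-17", "2020-01-05",
                     "2020-01-07", "2020-01-09", "2020-01-16", "2020-01-17",
                     "2020-01-12", "2020-01-12", "2020-01-11", "2020-01-12",
                     "2020-01-18", "2020-01-20")),
  qty    = c(2, 3, 6, 1, 9, 2, 9, 34, 1, 23, 12, 18, 21, 62)
)

# 'result' is an illustrative name: skus plus the summed quantity per window.
result <- skus
result$qty <- vapply(seq_len(nrow(skus)), function(i) {
  # Transactions for this SKU whose date lies in [date1, date2], inclusive.
  hit <- transactions$SKU_ID == skus$SKU_ID[i] &
    transactions$date >= skus$date1[i] &
    transactions$date <= skus$date2[i]
  sum(transactions$qty[hit])
}, numeric(1))

print(result)
# qty column: 9, 10, 24, 30, 83, 43 (same values and row order as desiredOutput)
```

This row-by-row loop scans all of transactions once per skus row, so it is only reasonable for small data; the join-based answers above are the better fit for larger tables.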