I have been searching for several days but cannot find an answer that fits my situation.
I have two data frames. The first is skus:
skus <- data.frame("SKU_ID"= c(123,234,345,135,135,234),
"date1"=c(as.Date("2020-01-10"),as.Date("2020-01-05"),as.Date("2020-01-11"),as.Date("2020-01-10"),
as.Date("2020-01-15"),as.Date("2020-01-15")),
"date2"=c(as.Date("2020-01-20"),as.Date("2020-01-08"),as.Date("2020-01-15"),as.Date("2020-01-12"),
as.Date("2020-01-20"),as.Date("2020-01-25")))
and the second is transactions:
transactions <- data.frame("SKU_ID"=c(123,123,123,234,234,234,234,234,345,345,135,135,135,135),
"date"=c(as.Date("2020-01-02"),as.Date("2020-01-11"),as.Date("2020-01-17"),as.Date("2020-01-05"),
as.Date("2020-01-07"),as.Date("2020-01-09"),as.Date("2020-01-16"),as.Date("2020-01-17"),
as.Date("2020-01-12"),as.Date("2020-01-12"),as.Date("2020-01-11"),as.Date("2020-01-12"),as.Date("2020-01-18"),
as.Date("2020-01-20")),
"qty"=c(2,3,6,1,9,2,9,34,1,23,12,18,21,62))
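I am trying to get the following output, where qty is the total transaction qty for the same SKU_ID with a date between date1 and date2 (inclusive):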
desiredOutput <- data.frame("SKU_ID"= c(123,234,345,135,135,234),
"date1"=c(as.Date("2020-01-10"),as.Date("2020-01-05"),as.Date("2020-01-11"),as.Date("2020-01-10"),
as.Date("2020-01-15"),as.Date("2020-01-15")),
"date2"=c(as.Date("2020-01-20"),as.Date("2020-01-08"),as.Date("2020-01-15"),as.Date("2020-01-12"),
as.Date("2020-01-20"),as.Date("2020-01-25")),
"qty"=c(9,10,24,30,83,43))
I have tried sqldf, dplyr, and data.table solutions, but none of them gave me the result I want.
Any suggestions?
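Answer 0 (score: 1)
Here is a dplyr solution. It joins every sku row to the transactions with the same SKU_ID, keeps only the transactions dated within [date1, date2], and sums qty per window.
library(dplyr)
skus %>%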
inner_join(transactions, by=c("SKU_ID"), suffix = c(".a", ".b")) %>%
filter(date1 <= date & date2 >= date) %>%
group_by(SKU_ID, date1, date2) %>%
summarise(qty = sum(qty)) %>%
ungroup()
# A tibble: 6 x 4
SKU_ID date1 date2 qty
<dbl> <date> <date> <dbl>
1 123 2020-01-10 2020-01-20 9
2 135 2020-01-10 2020-01-12 30
3 135 2020-01-15 2020-01-20 83
4 234 2020-01-05 2020-01-08 10
5 234 2020-01-15 2020-01-25 43
6 345 2020-01-11 2020-01-15 24
Answer 1 (score: 1)
Or in data.table, using a non-equi join that adds qty to skus by reference:
library(data.table)
setDT(skus)[, qty :=
setDT(transactions)[.SD, on=.(SKU_ID, date>=date1, date<=date2),
by=.EACHI, sum(qty)]$V1
]
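To check either answer against the desired output, the windowed sum can also be written directly in base R. This is a minimal sketch rather than part of the answers above: it loops over the rows of skus and sums the qty of the transactions with a matching SKU_ID whose date lies in the closed interval [date1, date2]. The name in_window is introduced here for illustration only, and the data is copied from the question.

```r
# Data copied from the question.
skus <- data.frame(
  SKU_ID = c(123, 234, 345, 135, 135, 234),
  date1 = as.Date(c("2020-01-10", "2020-01-05", "2020-01-11",
                    "2020-01-10", "2020-01-15", "2020-01-15")),
  date2 = as.Date(c("2020-01-20", "2020-01-08", "2020-01-15",
                    "2020-01-12", "2020-01-20", "2020-01-25"))
)
transactions <- data.frame(
  SKU_ID = c(123, 123, 123, 234, 234, 234, 234, 234, 345, 345, 135, 135, 135, 135),
  date = as.Date(c("2020-01-02", "2020-01-11", "2020-01-17", "2020-01-05",
                   "2020-01-07", "2020-01-09", "2020-01-16", "2020-01-17",
                   "2020-01-12", "2020-01-12", "2020-01-11", "2020-01-12",
                   "2020-01-18", "2020-01-20")),
  qty = c(2, 3, 6, 1, 9, 2, 9, 34, 1, 23, 12, 18, 21, 62)
)

# For each row of skus, sum the qty of transactions with the same SKU_ID
# whose date falls within [date1, date2] (both ends inclusive).
skus$qty <- sapply(seq_len(nrow(skus)), function(i) {
  in_window <- transactions$SKU_ID == skus$SKU_ID[i] &
    transactions$date >= skus$date1[i] &
    transactions$date <= skus$date2[i]
  sum(transactions$qty[in_window])
})

skus$qty
# 9 10 24 30 83 43, the same values as desiredOutput$qty
```

Unlike the inner join in the dplyr answer, this keeps the original row order of skus and returns 0 for a window with no matching transactions. It scans transactions once per sku row, so the join-based answers are likely the better choice for large data.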