I have ExpandedGrid (11760 obs. of 4 variables):
Date - date format
Device - factor
Creative - factor
Partner - factor
I also have MediaPlanDF (215 obs. of 6 variables):
Interval - an interval of dates I created using lubridate
Partner - factor
Device - factor
Creative - factor
Daily Spend - num
Daily Impressions - num
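Here is my problem.
For each row of ExpandedGrid, I need to sum Daily Spend and Daily Impressions from the corresponding columns of MediaPlanDF, based on the following two criteria:
Criterion 1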
- ExpandedGrid$Device matches MediaPlanDF$Device
- ExpandedGrid$Creative matches MediaPlanDF$Creative
- ExpandedGrid$Partner matches MediaPlanDF$Partner
Criterion 2
- ExpandedGrid$Date falls within MediaPlanDF$Interval
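
For illustration, here is a minimal base-R sketch of one way the two criteria might be combined. It is not a definitive solution. It assumes each interval is stored as two Date columns, `Start` and `End` (hypothetical names; with lubridate they could presumably be derived from `int_start()` and `int_end()`), and it adds a helper column `RowId` that is not in the original data. The toy values are made up, loosely based on the excerpt in this post.

```r
# Toy data mirroring the structure described above (values are illustrative).
ExpandedGrid <- data.frame(
  Date     = as.Date(c("2015-08-31", "2015-09-15", "2015-10-10", "2015-09-15")),
  Device   = c("Desktop", "Desktop", "Desktop", "Mobile"),
  Creative = c("Standard", "Standard", "Standard", "Standard"),
  Partner  = c("ACCUEN", "ACCUEN", "ACCUEN", "ACCUEN"),
  stringsAsFactors = FALSE
)

# Assumption: the interval is split into Start and End (hypothetical columns).
MediaPlanDF <- data.frame(
  Start    = as.Date(c("2015-08-30", "2015-09-01")),
  End      = as.Date(c("2015-10-03", "2015-09-30")),
  Device   = c("Desktop", "Desktop"),
  Creative = c("Standard", "Standard"),
  Partner  = c("ACCUEN", "ACCUEN"),
  Daily.Spend       = c(1696.27, 100),
  Daily.Impressions = c(1000339.17, 5000),
  stringsAsFactors = FALSE
)

keys <- c("Device", "Creative", "Partner")

# Hypothetical helper id so each grid row can be tracked through the merge.
ExpandedGrid$RowId <- seq_len(nrow(ExpandedGrid))

# Criterion 1: pair every grid row with every plan row sharing all three keys.
paired <- merge(ExpandedGrid, MediaPlanDF, by = keys)

# Criterion 2: keep only the pairs whose Date falls inside the plan interval.
paired <- paired[paired$Date >= paired$Start & paired$Date <= paired$End, ]

# Sum spend and impressions over all matching plan rows, per grid row.
totals <- aggregate(cbind(Daily.Spend, Daily.Impressions) ~ RowId,
                    data = paired, FUN = sum)

# Attach the totals to the grid; rows without any match get NA.
result <- merge(ExpandedGrid, totals, by = "RowId", all.x = TRUE)
print(result)
```

The merge handles the three equality conditions in one step, so only the date condition needs an explicit filter. On the real data the intermediate `paired` table could be much larger than either input, which may or may not matter at this size.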
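I can work out each criterion on its own, but I am having a very hard time putting them together without errors, and my search for answers has not been very successful (plenty of good examples, but nothing I have the skill to adapt to my context). I have tried a variety of approaches, but my thinking is drifting toward overly complicated solutions, and I need help.
I tried indexing like this: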
indexb <- as.character(ExpandedGrid$Device) == as.character(MediaPlanDF$Device);
indexc <- as.character(ExpandedGrid$Creative) == as.character(MediaPlanDF$Creative);
indexd <- as.character(ExpandedGrid$Partner) == as.character(MediaPlanDF$Partner);
index <- ExpandedGrid$Date %within% MediaPlanDF$Interval;
KEYDF <- data.frame(index, indexb, indexc, indexd)
KEYDF$Key <- apply(KEYDF, 1, function(x)(all(x) || all(!x)))
KEYDF$Key.cha <- as.character(KEYDF$Key)
outputbydim <- do.call(rbind, lapply(KEYDF$Key.cha, function(x){
  index <- x == "TRUE";
  list(impressions = sum(MediaPlanDF$Daily.Impressions[index]),
       spend = sum(MediaPlanDF$Daily.Spend[index]))}))
Unfortunately, this excludes the right rows from being summed, but the summed values for the rows that are TRUE are not correct.
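
Two R behaviours that may be involved here, shown as a small sketch with made-up vectors (not the real data): `==` compares two vectors position by position and recycles the shorter one, so it does not test every grid row against every plan row; and a single `TRUE` used as an index selects the whole vector, so the sum covers every row.

```r
# Made-up vectors standing in for the Device columns of the two data frames.
grid_device <- c("Desktop", "Mobile", "Desktop", "Mobile")
plan_device <- c("Mobile", "Desktop")

# Position-by-position comparison: plan_device is recycled to
# Mobile, Desktop, Mobile, Desktop, so no position happens to match.
elementwise <- grid_device == plan_device
print(elementwise)

# Every-pair comparison: one row per grid value, one column per plan value.
pairwise <- outer(grid_device, plan_device, "==")
print(pairwise)

# A length-one TRUE index is recycled and selects the whole vector.
spend <- c(10, 20, 30)
print(sum(spend[TRUE]))   # sums all three values
print(sum(spend[FALSE]))  # sums nothing
```

If this is indeed what is happening, each TRUE key would yield the grand total of the MediaPlanDF column rather than the total of the matching rows.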
Here is an excerpt of the data:
ExpandedGrid:
Date        Device     Creative    Partner
2015-08-31  "Desktop"  "Standard"  "ACCUEN"

MediaPlanDF:
Interval                                          Device     Creative    Partner   Daily Spend  Daily Impressions
2015-08-30 17:00:00 PDT--2015-10-03 17:00:00 PDT  "Desktop"  "Standard"  "ACCUEN"  1696.27      1000339.17
Does anyone know where to go from here?
Thanks in advance!