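I have 2 data frames in R:
I want to add a calculated column to tvNationalSale that contains the sum of sessions in the 5 minutes before each ad aired. I am using the dplyr package for basic formatting.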
> glimpse(tvNationalSale)
Observations: 1443
Variables:
$ Sort.Date (fctr) 5/8/2015, 5/8/2015, 5/8/2015, 5/8/2015, 5/8/2015, 5/8/2015, 5/8/2015, 5/8...
$ Before.Time (time) 2015-08-05 06:03:00, 2015-08-05 21:12:00, 2015-08-05 08:49:00, 2015-08-05...
$ Ad.Time (time) 2015-08-05 06:08:00, 2015-08-05 21:17:00, 2015-08-05 08:54:00, 2015-08-05...
$ After.Time (time) 2015-08-05 06:13:00, 2015-08-05 21:22:00, 2015-08-05 08:59:00, 2015-08-05...
$ Market.Long.Desc (fctr) National, National, National, National, National, National, National, Nat...
$ Campaign.Name (fctr) europe-sale, europe-sale, europe-sale, europe-sale, europe-sale, europe-s...
> glimpse(workingNational)
Observations: 44616
Variables:
$ date (date) 2015-05-01, 2015-05-01, 2015-05-01, 2015-05-01, 2015-05-01, 2015-05-01, 2015-05-0...
$ hour (fctr) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ minute (fctr) 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,...
$ sessions (dbl) 161, 71, 65, 58, 63, 58, 56, 41, 56, 45, 58, 57, 37, 48, 37, 41, 43, 44, 36, 38, 4...
$ time (chr) "01:01:00", "01:02:00", "01:03:00", "01:04:00", "01:05:00", "01:06:00", "01:07:00"...
$ datetime (time) 2015-05-01 01:01:00, 2015-05-01 01:02:00, 2015-05-01 01:03:00, 2015-05-01 01:04:0...
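This example shows how to compute a rolling metric within a single data frame, but I can't figure out how to compute a similar metric from a separate data frame.
I tried the code below. I suspect it doesn't work because I'm trying to reference a separate data frame inside the mutate() call.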
tvNationalSale <- tvNationalSale %>%
mutate(Before.Sessions=sum(filter(workingNational, datetime>=tvNationalSale$Before.Time & datetime<=tvNationalSale$Ad.Time)$sessions))
Any ideas on how to add a calculated metric from another data frame?
Answer 0 (score: 0)
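Assuming your workingNational data has no gaps or other irregularities, you can look up the position of each ad time in workingNational and then sum the five entries leading up to that time: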
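# Row position in workingNational of the minute each ad aired (NA if there is no exact match)
indices <- match(tvNationalSale$Ad.Time, workingNational$datetime)
# For each ad, sum the sessions in the five rows immediately before that position
tvNationalSale$fiveMinutesBefore <- rowSums(sapply(1:5, function(x) workingNational$sessions[indices-x]))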
head(tvNationalSale)
# Ad.Time fiveMinutesBefore
# 1 2015-01-03 04:02:00 3126
# 2 2015-01-05 02:57:00 2221
# 3 2015-01-04 14:53:00 4269
# 4 2015-01-07 01:17:00 1916
# 5 2015-01-06 15:37:00 2484
# 6 2015-01-03 14:23:00 3092
Data:
set.seed(144)
workingNational=data.frame(datetime=seq(from=ISOdate(2015, 1, 1), to=ISOdate(2015, 1, 8), by="min"))
workingNational$sessions <- sample(1:1000, nrow(workingNational), replace=TRUE)
tvNationalSale=data.frame(Ad.Time=sample(workingNational$datetime, 100))
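The approach above relies on workingNational having exactly one row per minute. If that assumption does not hold, one possible alternative is to sum by time window instead of by row position. The following is only a sketch: the helper name sum_before, its window_secs parameter, and the tiny data set (which deliberately skips one minute) are made up for illustration and are not part of the original question or answer.

```r
# Hypothetical toy data: one row per minute, but minute 5 is missing (a gap)
start <- as.POSIXct("2015-01-01 00:00:00", tz = "UTC")
workingNational <- data.frame(
  datetime = start + 60 * c(0:4, 6:10),
  sessions = c(10, 20, 30, 40, 50, 70, 80, 90, 100, 110)
)
tvNationalSale <- data.frame(Ad.Time = start + 60 * c(5, 8))

# Hypothetical helper: sum sessions whose timestamp lies in
# [ad_time - window_secs, ad_time), regardless of row positions
sum_before <- function(ad_time, times, sessions, window_secs = 5 * 60) {
  in_window <- times >= ad_time - window_secs & times < ad_time
  sum(sessions[in_window])
}

# Apply the helper to every ad; vapply guarantees one numeric value per row
tvNationalSale$fiveMinutesBefore <- vapply(
  seq_len(nrow(tvNationalSale)),
  function(i) sum_before(tvNationalSale$Ad.Time[i],
                         workingNational$datetime,
                         workingNational$sessions),
  numeric(1)
)

print(tvNationalSale)
# fiveMinutesBefore is 150 for the first ad and 240 for the second
```

This scans all of workingNational once per ad, so it is slower than the match() lookup on large data, but it does not break when a minute is missing or when an ad time has no exact match in workingNational.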