Rolling aggregates by day of week over the last n weeks - R

Time: 2016-06-23 10:37:01

Tags: r

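For some website data, I have a data frame with a column of dates and a column of visitors. I would like to add some rolling aggregate columns based on the day of the week.

I am trying to aggregate (mean, median, sum, count) by day of the week, so that day x is grouped with day x - 7, day x - 14, ..., x - 7*n, where n is the number of weeks required in the window and the earliest date in the data is more than 7*n days back.

For example, if the visitor counts on the last 5 Fridays were 100, 110, 120, 130, 160, then the entry for the most recent Friday would be 130 for the median over 3 weeks, and 136.67 for the mean over the last 3 Fridays.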
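The Friday example can be checked with a few lines of base R. This is only an illustration of the arithmetic; the variable names `fridays` and `last3` are made up for the sketch, and the vector is assumed to be ordered oldest to newest.

```r
# Visitor counts on the last 5 Fridays, assumed ordered oldest to newest
fridays <- c(100, 110, 120, 130, 160)

# A 3-week window keeps only the 3 most recent Fridays: 120, 130, 160
last3 <- tail(fridays, 3)

median(last3)          # 130
round(mean(last3), 2)  # 136.67
```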

Sample dataset:

structure(list(visit_date = structure(1:20, .Label = c("01-01-16", 
"01-02-16", "01-03-16", "01-04-16", "01-05-16", "01-06-16", "01-07-16", 
"01-08-16", "01-09-16", "01-10-16", "01-11-16", "01-12-16", "01-13-16", 
"01-14-16", "01-15-16", "01-16-16", "01-17-16", "01-18-16", "01-19-16", 
"01-20-16"), class = "factor"), visitors = c(114L, 158L, 153L, 
157L, 192L, 128L, 197L, 146L, 123L, 127L, 170L, 126L, 106L, 112L, 
119L, 184L, 186L, 171L, 183L, 125L)), .Names = c("visit_date", 
"visitors"), class = "data.frame", row.names = c(NA, -20L))

Ideal output for sum():

newdf <- structure(list(visit_date = structure(1:20, .Label = c("01-01-16", 
"01-02-16", "01-03-16", "01-04-16", "01-05-16", "01-06-16", "01-07-16", 
"01-08-16", "01-09-16", "01-10-16", "01-11-16", "01-12-16", "01-13-16", 
"01-14-16", "01-15-16", "01-16-16", "01-17-16", "01-18-16", "01-19-16", 
"01-20-16"), class = "factor"), visitors = c(114L, 158L, 153L, 
157L, 192L, 128L, 197L, 146L, 123L, 127L, 170L, 126L, 106L, 112L, 
119L, 184L, 186L, 171L, 183L, 125L), sum_visitors = c(NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 379L, 465L, 466L, 
498L, 501L, 359L)), .Names = c("visit_date", "visitors", "sum_visitors"
), class = "data.frame", row.names = c(NA, -20L))

I have looked at rollapply, but I am not sure how to roll row-wise within a data frame.

Hope this makes sense, thanks in advance.

1 answer:

Answer 0: (score: 0)

OK, my best guess (using data.table to simplify the grouped operations):

library(data.table)
library(lubridate)  # mdy()
library(zoo)        # rollsum(), rollapply()

dt <- data.table(visit_date = c("01-01-16", "01-02-16", "01-03-16", "01-04-16", "01-05-16", "01-06-16", "01-07-16", "01-08-16", "01-09-16", "01-10-16", "01-11-16", "01-12-16", "01-13-16", "01-14-16", "01-15-16", "01-16-16", "01-17-16", "01-18-16", "01-19-16", "01-20-16"),
                 visitors = c(114L, 158L, 153L, 157L, 192L, 128L, 197L, 146L, 123L, 127L, 170L, 126L, 106L, 112L, 119L, 184L, 186L, 171L, 183L, 125L))


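# Parse the "mm-dd-yy" strings into Date objects, then derive the weekday name
dt[, visit_date := mdy(visit_date)]
dt[, week_day := weekdays(visit_date)]

# Number of weeks in the rolling window
n_weeks <- 2

# Roll within each weekday group, so each row is combined with the same weekday in earlier weeks
dt[, sum_visitors := rollsum(visitors, n_weeks, align = "right", fill = NA), by = week_day]
# Same sum via rollapply(), which takes the aggregate function as an argument
dt[, sum_visitors_V2 := rollapply(visitors, n_weeks, sum, align = "right", fill = NA), by = week_day]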
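As a cross-check against the ideal output in the question, the same idea can be sketched in base R without any packages. The helper `roll_by_weekday` below is hypothetical (it is not part of the answer above), and it assumes one row per calendar day, sorted by date, with no gaps, so that the same weekday is always exactly 7 rows back. The question's ideal output corresponds to a 3-week window, so `n` is set to 3 here.

```r
# Hypothetical helper: for each row i, apply `fun` to rows
# i, i - 7, ..., i - 7 * (n - 1); returns NA until n weeks are available.
# Assumes one row per calendar day, sorted by date, with no gaps.
roll_by_weekday <- function(x, n, fun) {
  vapply(seq_along(x), function(i) {
    idx <- i - 7L * (seq_len(n) - 1L)
    if (min(idx) < 1L) NA_real_ else as.numeric(fun(x[idx]))
  }, numeric(1))
}

# Visitor counts from the sample dataset in the question
visitors <- c(114L, 158L, 153L, 157L, 192L, 128L, 197L, 146L, 123L, 127L,
              170L, 126L, 106L, 112L, 119L, 184L, 186L, 171L, 183L, 125L)

n <- 3  # window length in weeks, matching the ideal output in the question
sum_visitors    <- roll_by_weekday(visitors, n, sum)
mean_visitors   <- roll_by_weekday(visitors, n, mean)
median_visitors <- roll_by_weekday(visitors, n, median)
count_visitors  <- roll_by_weekday(visitors, n, length)

sum_visitors[15:20]  # 379 465 466 498 501 359
```

Rows 1 to 14 come out as NA because fewer than 3 weeks of history exist for them, which matches the `sum_visitors` column in the ideal output. If the real data has missing days, grouping by weekday as in the data.table code above is likely the safer route.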