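For some website data, I have a data frame with a column of dates and a column of visitors. I would like to add some rolling aggregate columns based on the day of the week.
I am trying to aggregate (mean, median, sum, count) by day of the week, so that day x is grouped with day x - 7, day x - 14, ..., day x - 7*n, where n is the number of weeks wanted in the window and the earliest date in the data is more than 7*n days back.
For example, if the last 5 Fridays had visitor levels of 100, 110, 120, 130, 160, then the entry for the most recent Friday would be 130 for the median over 3 weeks and 136.67 for the mean over the last 3 Fridays.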
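The arithmetic in that example can be checked with a few lines of base R. This is only an illustrative sketch; the variable names `fridays`, `n` and `last_n` are made up here and are not part of the real data.

```r
# Visitor levels for the last five Fridays, oldest first (from the example above)
fridays <- c(100, 110, 120, 130, 160)

# Number of weeks in the window
n <- 3

# Keep only the most recent n Fridays: 120, 130, 160
last_n <- tail(fridays, n)

median(last_n)          # 130
round(mean(last_n), 2)  # 136.67
```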
Example dataset:
structure(list(visit_date = structure(1:20, .Label = c("01-01-16",
"01-02-16", "01-03-16", "01-04-16", "01-05-16", "01-06-16", "01-07-16",
"01-08-16", "01-09-16", "01-10-16", "01-11-16", "01-12-16", "01-13-16",
"01-14-16", "01-15-16", "01-16-16", "01-17-16", "01-18-16", "01-19-16",
"01-20-16"), class = "factor"), visitors = c(114L, 158L, 153L,
157L, 192L, 128L, 197L, 146L, 123L, 127L, 170L, 126L, 106L, 112L,
119L, 184L, 186L, 171L, 183L, 125L)), .Names = c("visit_date",
"visitors"), class = "data.frame", row.names = c(NA, -20L))
Ideal output for sum():
newdf <- structure(list(visit_date = structure(1:20, .Label = c("01-01-16",
"01-02-16", "01-03-16", "01-04-16", "01-05-16", "01-06-16", "01-07-16",
"01-08-16", "01-09-16", "01-10-16", "01-11-16", "01-12-16", "01-13-16",
"01-14-16", "01-15-16", "01-16-16", "01-17-16", "01-18-16", "01-19-16",
"01-20-16"), class = "factor"), visitors = c(114L, 158L, 153L,
157L, 192L, 128L, 197L, 146L, 123L, 127L, 170L, 126L, 106L, 112L,
119L, 184L, 186L, 171L, 183L, 125L), sum_visitors = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 379L, 465L, 466L,
498L, 501L, 359L)), .Names = c("visit_date", "visitors", "sum_visitors"
), class = "data.frame", row.names = c(NA, -20L))
I have looked at rollapply, but I am not sure how to make it roll by row within a data frame.
I hope this makes sense. Thanks in advance.
Answer 0: (score: 0)
OK, here is my best guess (using data.table to simplify the grouping operations):
library(data.table)
library(lubridate)
library(zoo)
dt <- data.table(visit_date = c("01-01-16", "01-02-16", "01-03-16", "01-04-16", "01-05-16", "01-06-16", "01-07-16", "01-08-16", "01-09-16", "01-10-16", "01-11-16", "01-12-16", "01-13-16", "01-14-16", "01-15-16", "01-16-16", "01-17-16", "01-18-16", "01-19-16", "01-20-16"),
visitors = c(114L, 158L, 153L, 157L, 192L, 128L, 197L, 146L, 123L, 127L, 170L, 126L, 106L, 112L, 119L, 184L, 186L, 171L, 183L, 125L))
# Parse the month-day-year strings into Date objects
dt[, visit_date := mdy(visit_date)]
# Label each row with its day of the week; this is the grouping key
dt[, week_day := weekdays(visit_date)]
# Window width, in weeks
n_weeks <- 2
# Right-aligned rolling sum within each weekday group; incomplete windows are filled with NA
dt[, sum_visitors := rollsum(visitors, n_weeks, align = "right", fill = NA), by = week_day]
# Same calculation via rollapply, where sum can be swapped for another function such as mean or median
dt[, sum_visitors_V2 := rollapply(visitors, n_weeks, sum, align = "right", fill = NA), by = week_day]
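For comparison, the same group-by-weekday idea can be sketched in base R without any extra packages. This is a different technique from the data.table/zoo code above, not a rewrite of it. The helper `roll_trailing` is a hypothetical name introduced only for this sketch, and the code assumes the rows are already sorted by date with one row per day.

```r
# Sample data from the question
df <- data.frame(
  visit_date = sprintf("01-%02d-16", 1:20),
  visitors = c(114, 158, 153, 157, 192, 128, 197, 146, 123, 127,
               170, 126, 106, 112, 119, 184, 186, 171, 183, 125)
)
df$visit_date <- as.Date(df$visit_date, format = "%m-%d-%y")

# Hypothetical helper: apply `fun` to the trailing `n` values of `x`,
# returning NA until a full window of n values is available
roll_trailing <- function(x, n, fun) {
  vapply(seq_along(x), function(i) {
    if (i < n) NA_real_ else as.numeric(fun(x[(i - n + 1):i]))
  }, numeric(1))
}

# Window width in weeks (3 is an assumption inferred from the desired output)
n_weeks <- 3

# Day of week as "0" (Sunday) to "6" (Saturday); does not depend on the locale
wday <- format(df$visit_date, "%w")

# ave() applies the function within each weekday group and keeps row order
df$sum_visitors    <- ave(df$visitors, wday, FUN = function(x) roll_trailing(x, n_weeks, sum))
df$mean_visitors   <- ave(df$visitors, wday, FUN = function(x) roll_trailing(x, n_weeks, mean))
df$median_visitors <- ave(df$visitors, wday, FUN = function(x) roll_trailing(x, n_weeks, median))
```

With `n_weeks <- 3`, the `sum_visitors` column matches the ideal output given in the question: the first 14 rows are NA, followed by 379, 465, 466, 498, 501, 359. Weekdays with fewer than three observations, such as the Thursdays here, stay NA throughout.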