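I am new to dplyr and am trying to calculate, for each product and each week, the average sales over the last 3 weeks. If we do not have the full 3 weeks of history, we use however many data points exist.
Edit:
I apologize for missing this earlier, but there is one more factor to consider: the lag should only take into account the rows (weeks) that had no promotion.
Here is the sample data: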
df = structure(list(Product = structure(c(1L, 2L, 1L, 1L, 1L, 2L,
2L, 1L), .Label = c("A", "B"), class = "factor"), Promo = structure(c(1L,
1L, 2L, 1L, 1L, 2L, 1L, 1L), .Label = c("0", "1"), class = "factor"),
Week = structure(c(1L, 1L, 2L, 3L, 4L, 2L, 3L, 5L), .Label = c("2017-01-01",
"2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"), class = "factor"),
Sales = c(50, 50, 60, 70, 50, 50, 80, 70)), .Names = c("Product",
"Promo", "Week", "Sales"), row.names = c(NA, -8L), class = "data.frame")
  Product Promo       Week Sales
1       A     0 2017-01-01    50
2       B     0 2017-01-01    50
3       A     1 2017-01-02    60
4       A     0 2017-01-03    70
5       A     0 2017-01-04    50
6       B     1 2017-01-02    50
7       B     0 2017-01-03    80
8       A     0 2017-01-05    70
For each product, I want to look back over the 3 most recent non-promo (flag = 0) weeks, so the output would look like this:
  Product Promo       Week Sales Avg_Non_Promo_Sales
1       A     0 2017-01-01    50                  50
2       B     0 2017-01-01    50                  50
3       A     1 2017-01-02    60                  50  # only 1 non-promo week before
4       A     0 2017-01-03    70                  60  # (70 + 50) / 2
5       A     0 2017-01-04    50               56.67  # (50 + 70 + 50) / 3, non-promo only
6       B     1 2017-01-02    50                  50
7       B     0 2017-01-03    80                  65  # (50 + 80) / 2
8       A     0 2017-01-05    70                  60  # 240 / 4
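For reference, the calculation described above could be sketched roughly as follows. This is only a minimal sketch under my own assumptions, not a confirmed solution: the helper `trailing_non_promo_mean()` and the temporary `row_id` column are hypothetical names, the window counts the current week when it is itself a non-promo week, and the sample data is rebuilt with plain column types instead of factors. With `n = 3` it should reproduce the table except for the last row of product A, where a strict 3-week window gives 190 / 3 rather than the 240 / 4 shown; with `n = 4` it should match every row.

```r
library(dplyr)

# Sample data from the question, rebuilt with plain types for simplicity
df <- data.frame(
  Product = c("A", "B", "A", "A", "A", "B", "B", "A"),
  Promo   = c(0, 0, 1, 0, 0, 1, 0, 0),
  Week    = as.Date(c("2017-01-01", "2017-01-01", "2017-01-02", "2017-01-03",
                      "2017-01-04", "2017-01-02", "2017-01-03", "2017-01-05")),
  Sales   = c(50, 50, 60, 70, 50, 50, 80, 70)
)

# Hypothetical helper: for each position i, average the last `n` non-promo
# sales seen up to and including i. Returns NA if no non-promo week exists yet.
trailing_non_promo_mean <- function(sales, promo, n = 3) {
  is_non_promo <- as.character(promo) == "0"  # works for numeric or factor flags
  vapply(seq_along(sales), function(i) {
    idx <- seq_len(i)
    eligible <- sales[idx][is_non_promo[idx]]
    if (length(eligible) == 0) NA_real_ else mean(tail(eligible, n))
  }, numeric(1))
}

result <- df %>%
  mutate(row_id = row_number()) %>%   # remember the original row order
  arrange(Product, Week) %>%          # the helper assumes chronological order
  group_by(Product) %>%
  mutate(Avg_Non_Promo_Sales = trailing_non_promo_mean(Sales, Promo, n = 3)) %>%
  ungroup() %>%
  arrange(row_id) %>%                 # restore the original row order
  select(-row_id)

print(result)
```

The per-row loop is quadratic in the number of weeks per product, which seems acceptable for small data; a rolling-window function from a package such as zoo might be a better fit for larger inputs.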