Find the average over a lag of the previous few weeks from the current week in R

Time: 2018-02-23 10:13:15

Tags: r dplyr grouping lag summary

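I am new to dplyr and am trying to calculate, for each product and each week, the average sales over a lag of 3 weeks. If we do not have all 3 weeks of history, we use however many data points exist.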

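Edit:

I am sorry I missed this earlier, but there is one more factor to consider: the lag should only take into account the rows (weeks) that had no promotion.

Here is the sample data: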

df <- structure(list(Product = structure(c(1L, 2L, 1L, 1L, 1L, 2L, 
2L, 1L), .Label = c("A", "B"), class = "factor"), Promo = structure(c(1L, 
1L, 2L, 1L, 1L, 2L, 1L, 1L), .Label = c("0", "1"), class = "factor"), 
    Week = structure(c(1L, 1L, 2L, 3L, 4L, 2L, 3L, 5L), .Label = c("2017-01-01", 
    "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"), class = "factor"), 
    Sales = c(50, 50, 60, 70, 50, 50, 80, 70)), .Names = c("Product", 
"Promo", "Week", "Sales"), row.names = c(NA, -8L), class = "data.frame")
  Product Promo       Week Sales
1       A     0 2017-01-01    50
2       B     0 2017-01-01    50
3       A     1 2017-01-02    60
4       A     0 2017-01-03    70
5       A     0 2017-01-04    50
6       B     1 2017-01-02    50
7       B     0 2017-01-03    80
8       A     0 2017-01-05    70

I want to look back over 3 non-promo (flag = 0) weeks for each product, so the output would look like this:

  Product Promo       Week Sales Avg_Non_Promo_Sales
1       A     0 2017-01-01    50               50
2       B     0 2017-01-01    50               50
3       A     1 2017-01-02    60               50     # only 1 non-promo week before
4       A     0 2017-01-03    70               60     # (70 + 50) / 2
5       A     0 2017-01-04    50               56.67  # (50 + 70 + 50) / 3, non-promo weeks only
6       B     1 2017-01-02    50               50
7       B     0 2017-01-03    80               65     # (50 + 80) / 2
8       A     0 2017-01-05    70               60     # (50 + 70 + 50 + 70) / 4 = 240 / 4
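
One reading that reproduces every value in the expected output above (up to rounding) is: for each product, average the current week's sales (only when the current week is a non-promo week) together with the sales of up to 3 preceding non-promo weeks. The following is a minimal sketch under that assumption, not a definitive solution. The helper `avg_non_promo` and its `n_lags` argument are hypothetical names introduced here for illustration, and the sample data is rebuilt with plain character columns instead of factors.

```r
library(dplyr)

# Sample data from the question, rebuilt with character columns (assumption:
# factors are not required for the calculation).
df <- data.frame(
  Product = c("A", "B", "A", "A", "A", "B", "B", "A"),
  Promo   = c("0", "0", "1", "0", "0", "1", "0", "0"),
  Week    = c("2017-01-01", "2017-01-01", "2017-01-02", "2017-01-03",
              "2017-01-04", "2017-01-02", "2017-01-03", "2017-01-05"),
  Sales   = c(50, 50, 60, 70, 50, 50, 80, 70),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each position i (rows already sorted by week),
# average the last `n_lags` non-promo sales before i, plus the sales at i
# itself when row i is a non-promo week.
avg_non_promo <- function(sales, promo, n_lags = 3) {
  is_np <- as.character(promo) == "0"
  vapply(seq_along(sales), function(i) {
    before <- seq_len(i - 1)
    window <- tail(sales[before][is_np[before]], n_lags)
    if (is_np[i]) window <- c(window, sales[i])
    if (length(window) == 0) NA_real_ else mean(window)
  }, numeric(1))
}

result <- df %>%
  mutate(row_id = row_number()) %>%                 # remember original order
  group_by(Product) %>%
  arrange(as.Date(Week), .by_group = TRUE) %>%      # sort weeks within product
  mutate(Avg_Non_Promo_Sales = avg_non_promo(Sales, Promo)) %>%
  ungroup() %>%
  arrange(row_id) %>%                               # restore original order
  select(-row_id)

print(result)
```

A product whose first week is a promo week has no non-promo history at that point, so this sketch returns `NA` there. The question does not say what should happen in that case, so that choice is an assumption too.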

0 answers:

No answers