Sum of a variable based on two grouping variables, for the previous year

Time: 2019-08-20 03:14:37

Tags: r dplyr data.table

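I have a list of customers and the revenue earned from them through two channels (online and offline) over the past few financial years. I want a variable that shows each customer's total revenue (online + offline) for the previous year.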

The sample data is shown below, with the desired variable highlighted in yellow. The calculation is shown in the adjacent column.


I tried grouping by CustomerID and Fin Year, computing the sum of Revenue, and using lag() to get the previous year's total revenue, but this did not work.

df %>% group_by(CustomerID, FinYear) %>% mutate(yearly_totalRevenue = sum(Revenue)) %>%
  mutate(lastyear_totalRevenue = lag(yearly_totalRevenue)) %>% ungroup()
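A minimal sketch of why this attempt cannot return the previous year's total, assuming the `Fin Year` column has been renamed to `FinYear` and rebuilding the sample data by hand. Because the data is still grouped by both CustomerID and FinYear when lag() runs, lag() only sees rows of the same customer and year, so it returns NA for the first row of each group and the same year's total for the rest.

```r
library(dplyr)

# Sample data from the question; FinYear is an assumed rename of `Fin Year`
df <- data.frame(
  CustomerID = rep(paste0("Cust", 1:5), times = 2),
  FinYear    = rep(c("2010/11", "2011/12", "2012/13", "2013/14", "2014/15"), times = 2),
  Channel    = rep(c("Online", "Offline"), each = 5),
  Revenue    = c(858, 733, 248, 541, 222, 316, 412, 167, 385, 654)
)

res <- df %>%
  group_by(CustomerID, FinYear) %>%
  mutate(yearly_totalRevenue = sum(Revenue)) %>%
  # lag() is evaluated inside each (CustomerID, FinYear) group,
  # so it never reaches a row from a different year
  mutate(lastyear_totalRevenue = lag(yearly_totalRevenue)) %>%
  ungroup()

print(res)
```

In the result, the Online rows get NA and the Offline rows simply repeat the current year's total.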

Note: since the data is in the 10M range, memory-efficient code (preferably using data.table) would be highly appreciated.

Thanks.

Edit1: Added dput() of the sample data.

structure(list(
  CustomerID = c("Cust1", "Cust2", "Cust3", "Cust4", "Cust5",
                 "Cust1", "Cust2", "Cust3", "Cust4", "Cust5"),
  `Fin Year` = c("2010/11", "2011/12", "2012/13", "2013/14", "2014/15",
                 "2010/11", "2011/12", "2012/13", "2013/14", "2014/15"),
  Channel = c("Online", "Online", "Online", "Online", "Online",
              "Offline", "Offline", "Offline", "Offline", "Offline"),
  Revenue = c(858, 733, 248, 541, 222, 316, 412, 167, 385, 654)),
  row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))

1 Answer:

Answer 0: (score: 1)

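You can try the following (the `Fin Year` column is assumed to have been renamed to `FinYear`, as in the output below):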

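library(data.table)
# Total per customer and year, then shift by one row within each rowid(CustomerID) block
setDT(df)[, yearly_totalRevenue := sum(Revenue), by = .(CustomerID, FinYear)][,
    lastyear_totalRevenue := shift(yearly_totalRevenue), by = .(rowid(CustomerID))]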

Output:

    CustomerID FinYear Channel Revenue yearly_totalRevenue lastyear_totalRevenue
 1:      Cust1 2010/11  Online     858                1174                    NA
 2:      Cust2 2011/12  Online     733                1145                  1174
 3:      Cust3 2012/13  Online     248                 415                  1145
 4:      Cust4 2013/14  Online     541                 926                   415
 5:      Cust5 2014/15  Online     222                 876                   926
 6:      Cust1 2010/11 Offline     316                1174                    NA
 7:      Cust2 2011/12 Offline     412                1145                  1174
 8:      Cust3 2012/13 Offline     167                 415                  1145
 9:      Cust4 2013/14 Offline     385                 926                   415
10:      Cust5 2014/15 Offline     654                 876                   926
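Note that in the output above the shift runs across customers, because each customer in the sample data appears in only one financial year. If the real data has several years per customer, one alternative (not part of the answer above) is to aggregate to one row per customer and year, shift within each customer, and join the result back. The sketch below uses hypothetical data and assumes that FinYear strings sort chronologically and that no year is missing for a customer, since shift() takes the previous row rather than the previous calendar year.

```r
library(data.table)

# Hypothetical data: two customers, three financial years each, two channels
dt <- data.table(
  CustomerID = rep(c("Cust1", "Cust2"), each = 6),
  FinYear    = rep(rep(c("2010/11", "2011/12", "2012/13"), each = 2), times = 2),
  Channel    = rep(c("Online", "Offline"), times = 6),
  Revenue    = c(100, 50, 200, 20, 300, 10, 5, 5, 10, 10, 20, 20)
)

# 1. One row per customer and year, holding that year's total revenue
yearly <- dt[, .(yearly_totalRevenue = sum(Revenue)), by = .(CustomerID, FinYear)]

# 2. Order by year within customer, then shift the totals down one row per customer
setorder(yearly, CustomerID, FinYear)
yearly[, lastyear_totalRevenue := shift(yearly_totalRevenue), by = CustomerID]

# 3. Update join: add both columns to the original rows by reference
dt[yearly, on = .(CustomerID, FinYear),
   `:=`(yearly_totalRevenue   = i.yearly_totalRevenue,
        lastyear_totalRevenue = i.lastyear_totalRevenue)]

print(dt)
```

The aggregated table is much smaller than the original, and the update join modifies `dt` in place without copying it, which may help with data in the 10M range.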