Applying a function row-wise to calculate percentage change in R

Time: 2016-11-10 10:04:40

Tags: r time-series percentage data-manipulation

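I have annual time-series data in long format at the group-state-brand level. I want to apply a function that calculates the growth rate within each level.

Basically, (current value / previous value) - 1.

An excerpt of the data is below: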

Grp Sta Brnd     Yr      Sls
A   AL  Ben's   2012    29770
A   AL  Ben's   2013    23357
A   AL  Ben's   2014    22442
A   AL  Ben's   2015    21848
A   AL  Ben's   2016    13799
B   CA  Scott's 2012    1079
B   CA  Scott's 2013    11178
B   CA  Scott's 2014    14778
B   CA  Scott's 2015    15241
B   CA  Scott's 2016    10569
C   TX  Joey's  2012    1673
C   TX  Joey's  2013    1290
C   TX  Joey's  2014    899
C   TX  Joey's  2015    732
C   TX  Joey's  2016    294

Basically, there are 5 rows for each unique grp-state-brand level. The desired output:

Grp Sta Brnd     Yr      Sls    Grwth
A   AL  Ben's   2012    29770   
A   AL  Ben's   2013    23357   -22%
A   AL  Ben's   2014    22442   -4%
A   AL  Ben's   2015    21848   -3%
A   AL  Ben's   2016    13799   -37%
B   CA  Scott's 2012    1079    
B   CA  Scott's 2013    11178   936%
B   CA  Scott's 2014    14778   32%
B   CA  Scott's 2015    15241   3%
B   CA  Scott's 2016    10569   -31%
C   TX  Joey's  2012    1673    
C   TX  Joey's  2013    1290    -23%
C   TX  Joey's  2014    899     -30%
C   TX  Joey's  2015    732     -19%
C   TX  Joey's  2016    294     -60%
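
As a quick check on the formula, here is one row worked by hand. The symbols are only illustrative notation, not part of the data: S_t is Sls in year t and g_t is the growth rate. The example uses the Ben's rows for 2012 and 2013.

```latex
\[
g_t = \frac{S_t}{S_{t-1}} - 1,
\qquad
g_{2013} = \frac{23357}{29770} - 1 \approx -0.2154 \approx -22\%
\]
```

The first year of each group has no previous value, so its growth rate is left empty.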

2 answers:

Answer 0: (score: 2)

Using base R

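# c(NA, x[-length(x)]) is the previous value within each group;
# stats::lag() would not shift the values of a plain vector
df$Grwth <- ave(df$Sls, df$Grp, df$Sta, df$Brnd, FUN = function(x)
                       round((x / c(NA, x[-length(x)]) - 1) * 100))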
df
#   Grp Sta   Brnd   Yr   Sls Grwth
#1    A  AL   Bens 2012 29770    NA
#2    A  AL   Bens 2013 23357   -22
#3    A  AL   Bens 2014 22442    -4
#4    A  AL   Bens 2015 21848    -3
#5    A  AL   Bens 2016 13799   -37
#6    B  CA Scotts 2012  1079    NA
#7    B  CA Scotts 2013 11178   936
#8    B  CA Scotts 2014 14778    32
#9    B  CA Scotts 2015 15241     3
#10   B  CA Scotts 2016 10569   -31
#11   C  TX  Joeys 2012  1673    NA
#12   C  TX  Joeys 2013  1290   -23
#13   C  TX  Joeys 2014   899   -30
#14   C  TX  Joeys 2015   732   -19
#15   C  TX  Joeys 2016   294   -60
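
One detail worth checking when computing the previous value in base R: with only base packages attached, `lag()` resolves to `stats::lag()`, which leaves the values of a plain vector in place and only adjusts its time-series attributes. It shifts values only if a package such as dplyr, whose `lag()` does shift, is attached. A minimal sketch of building the shifted vector by hand, using the group A sales from the question (the names `x`, `prev` and `growth` are just illustrative):

```r
# Sls for group A (Ben's), taken from the question's data
x <- c(29770, 23357, 22442, 21848, 13799)

# stats::lag() keeps the values as they are; compare after dropping attributes
same_values <- identical(as.numeric(stats::lag(x)), x)

# Previous value built by hand: NA first, then every value except the last
prev <- c(NA, head(x, -1))

# Percentage change relative to the previous year, rounded to whole percent
growth <- round((x / prev - 1) * 100)
print(growth)
```

Wrapping the last two steps in a function and passing it as `FUN` to `ave()` applies the same calculation separately to each Grp/Sta/Brnd group.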

Answer 1: (score: 1)

library(plyr)  # provides ddply(), .() and mutate()

df=data.frame(Grp = c(rep("A",5),rep("B",5),rep("C",5)), Sta = c(rep("AL",5),rep("CA",5),rep("TX",5)), 
          Brnd = c(rep("Ben's",5),rep("Scott's",5),rep("Joey's",5)), 
          Yr=rep(c(2012,2013,2014,2015,2016),3), 
          Sls = c(29770,23357,22442,21848,13799,1079,11178,14778,15241,10569,1673,1290,899,732,294))

# within each Grp/Sta/Brnd group: percent change from the previous row, formatted as text
ddply(df, .(Grp,Sta,Brnd),mutate, y = sprintf("%.2f%%",c(NA,100*diff(Sls)/Sls[-length(Sls)])))


   Grp Sta    Brnd   Yr   Sls       y
1    A  AL   Ben's 2012 29770     NA%
2    A  AL   Ben's 2013 23357 -21.54%
3    A  AL   Ben's 2014 22442  -3.92%
4    A  AL   Ben's 2015 21848  -2.65%
5    A  AL   Ben's 2016 13799 -36.84%
6    B  CA Scott's 2012  1079     NA%
7    B  CA Scott's 2013 11178 935.96%
8    B  CA Scott's 2014 14778  32.21%
9    B  CA Scott's 2015 15241   3.13%
10   B  CA Scott's 2016 10569 -30.65%
11   C  TX  Joey's 2012  1673     NA%
12   C  TX  Joey's 2013  1290 -22.89%
13   C  TX  Joey's 2014   899 -30.31%
14   C  TX  Joey's 2015   732 -18.58%
15   C  TX  Joey's 2016   294 -59.84%