Forecasting future values

Time: 2019-02-22 11:40:52

Tags: r dplyr data-manipulation

I have a data.frame with three columns: Year, Nominal_Revenue and COEFFICIENT. I want to forecast the missing Nominal_Revenue values, as in the example below.

    library(dplyr)

    TEST <- data.frame(
      Year = c(2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020,2021),
      Nominal_Revenue = c(8634,5798,6011,6002,6166,6478,6731,7114,6956,6968,7098,7610,7642,8203,9856,10328,11364,12211,13150,NA,NA,NA),
      COEFFICIENT = c(NA,1.016,1.026,1.042,1.049,1.106,1.092,1.123,1.121,0.999,1.059,1.066,1.006,1.081,1.055,1.063,1.071,1.04,1.072,1.062,1.07,1.075)
    )

    SIMULATION <- mutate(TEST, FORECAST = lag(Nominal_Revenue) * COEFFICIENT)

Of the missing years, this code only produces a forecast for one, namely 2019; the FORECAST values for 2020 and 2021 are still NA.
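The behaviour described above can be reproduced on a toy vector. This is only an illustrative sketch: the names `revenue`, `coefficient` and `one_pass` and their values are hypothetical and not taken from the data above. `lag()` shifts the values down by one row, so a row whose previous value is itself NA gets NA again.

```r
library(dplyr)

# Hypothetical toy values: one known revenue followed by two gaps.
revenue     <- c(100, NA, NA)
coefficient <- c(NA, 1.1, 1.2)

# lag(revenue) is c(NA, 100, NA): only the second row sees a known previous value.
one_pass <- lag(revenue) * coefficient

# The third element stays NA because its previous value was still NA
# when the whole column was computed in one vectorised step.
print(one_pass)
```

A single vectorised pass therefore cannot chain forecasts; each later year needs the forecast of the year before it to exist first.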

What I want is to fill in results only where the Nominal_Revenue column is NA, that is, for 2019, 2020 and 2021, with each forecast built on the previous year's value.

Can anyone help me with this code?

1 answer:

Answer 0: (score: 3)

Because each forecast depends on the previously computed value, we can loop once per NA in the column and apply dplyr on each pass:

# One pass per NA: each pass fills the NAs whose previous row already has a value.
for (i in seq_len(sum(is.na(TEST$Nominal_Revenue)))) {
  TEST <- TEST %>%
    mutate(Nominal_Revenue = if_else(is.na(Nominal_Revenue), COEFFICIENT * lag(Nominal_Revenue), Nominal_Revenue))
}

> TEST
   Year Nominal_Revenue COEFFICIENT
1  2000         8634.00          NA
2  2001         5798.00       1.016
3  2002         6011.00       1.026
4  2003         6002.00       1.042
5  2004         6166.00       1.049
6  2005         6478.00       1.106
7  2006         6731.00       1.092
8  2007         7114.00       1.123
9  2008         6956.00       1.121
10 2009         6968.00       0.999
11 2010         7098.00       1.059
12 2011         7610.00       1.066
13 2012         7642.00       1.006
14 2013         8203.00       1.081
15 2014         9856.00       1.055
16 2015        10328.00       1.063
17 2016        11364.00       1.071
18 2017        12211.00       1.040
19 2018        13150.00       1.072
20 2019        13965.30       1.062
21 2020        14942.87       1.070
22 2021        16063.59       1.075
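
As a possible alternative to the loop (a sketch that is not part of the original answer), the same numbers can be obtained without iteration. It assumes that all NAs in Nominal_Revenue form a single trailing run directly after the last observed value, as they do in this data; under that assumption each forecast equals the last observed value times the running product of the coefficients. The helper names `na_idx` and `last_known` are introduced here for illustration only.

```r
# Rebuild the question's data so the block is self-contained.
TEST <- data.frame(
  Year = 2000:2021,
  Nominal_Revenue = c(8634, 5798, 6011, 6002, 6166, 6478, 6731, 7114, 6956,
                      6968, 7098, 7610, 7642, 8203, 9856, 10328, 11364,
                      12211, 13150, NA, NA, NA),
  COEFFICIENT = c(NA, 1.016, 1.026, 1.042, 1.049, 1.106, 1.092, 1.123, 1.121,
                  0.999, 1.059, 1.066, 1.006, 1.081, 1.055, 1.063, 1.071,
                  1.04, 1.072, 1.062, 1.07, 1.075)
)

# Assumption: the NAs are one trailing block (here rows 20 to 22).
na_idx     <- which(is.na(TEST$Nominal_Revenue))
last_known <- TEST$Nominal_Revenue[min(na_idx) - 1]

# Forecast = last observed value * cumulative product of the coefficients.
TEST$Nominal_Revenue[na_idx] <- last_known * cumprod(TEST$COEFFICIENT[na_idx])

print(tail(TEST, 4))
```

If the NAs were scattered through the column rather than sitting at the end, this shortcut would not apply and the loop above would be the safer choice.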