Iteratively applying a function to its own result in R

Time: 2018-07-02 16:45:44

Tags: r dplyr tidyverse

I have a dataset that looks like this:

dat <- structure(list(year = c(2003, 2004, 2005, 2006, 2007, 2008, 2009, 
2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017), CD = c(246.74, 
271.25, 295.21, 307.46, 405.82, 391.65, 439.1, 538.39, 549.27, 
559.94, 510.51, 516.14, 480.25, 472.18, 460.56), Growth = c(1.17, 
0.94, 1.05, 0.95, 1, 1.04, 1.09, 1.08, 1, 1.08, 0.97, 0.99, 1.06, 
0.99, 0.99)), .Names = c("year", "CD", "Growth"), class = "data.frame", row.names = c(NA, 
-15L))

which prints as

   year     CD    Growth
16 2003  246.74   1.17
17 2004  271.25   0.94
18 2005  295.21   1.05
19 2006  307.46   0.95
20 2007  405.82   1.00
21 2008  391.65   1.04
22 2009  439.10   1.09
23 2010  538.39   1.08
24 2011  549.27   1.00
25 2012  559.94   1.08
26 2013  510.51   0.97
27 2014  516.14   0.99
28 2015  480.25   1.06
29 2016  472.18   0.99
30 2017  460.56   0.99

What I need to do is create a new column, called KD, with the following values:

  1. For the year 2007: CD

  2. For every year after 2007: KD of the previous year * Growth of the current year

  3. For every year before 2007: KD of the following year / Growth of the current year

In other words, 2007 is the reference year: KD[year == 2007] should be 405.82, KD[year == 2008] should be 422.05 (405.82 * 1.04), and KD[year == 2009] should be 460.04 (422.05 * 1.09).

Likewise, KD[year == 2006] should be 427.18 (405.82 / 0.95), and KD[year == 2005] should be 406.84 (427.18 / 1.05).

Is there a simple way to do this in R without resorting to a cumbersome for loop?

2 Answers:

Answer 0: (score: 1)

We can do something like this:

library(dplyr)

# Assumes dat is sorted by year
dat %>%
  mutate(KD_ref = CD[year == 2007],
         # cumulative 1/Growth, accumulated backwards from 2006
         Growth_cumdiv = c(rev(cumprod(rev(1/Growth[year < 2007]))),
                           rep(NA, sum(year >= 2007))),
         # cumulative Growth, accumulated forwards from 2008
         Growth_cumprod = c(rep(NA, sum(year <= 2007)),
                            cumprod(Growth[year > 2007])),
         KD = case_when(
           year < 2007 ~ KD_ref*Growth_cumdiv,
           year == 2007 ~ KD_ref,
           year > 2007 ~ KD_ref*Growth_cumprod
         ))

Result:

   year     CD Growth KD_ref Growth_cumdiv Growth_cumprod       KD
1  2003 246.74   1.17 405.82     0.9115351             NA 369.9192
2  2004 271.25   0.94 405.82     1.0664960             NA 432.8054
3  2005 295.21   1.05 405.82     1.0025063             NA 406.8371
4  2006 307.46   0.95 405.82     1.0526316             NA 427.1789
5  2007 405.82   1.00 405.82            NA             NA 405.8200
6  2008 391.65   1.04 405.82            NA       1.040000 422.0528
7  2009 439.10   1.09 405.82            NA       1.133600 460.0376
8  2010 538.39   1.08 405.82            NA       1.224288 496.8406
9  2011 549.27   1.00 405.82            NA       1.224288 496.8406
10 2012 559.94   1.08 405.82            NA       1.322231 536.5878
11 2013 510.51   0.97 405.82            NA       1.282564 520.4902
12 2014 516.14   0.99 405.82            NA       1.269738 515.2853
13 2015 480.25   1.06 405.82            NA       1.345923 546.2024
14 2016 472.18   0.99 405.82            NA       1.332464 540.7404
15 2017 460.56   0.99 405.82            NA       1.319139 535.3330

We can also wrap this in a function:

library(dplyr)
library(rlang)

KD_calc <- function(DF, ref_year, KD_colname){
  # Capture the unquoted column name and convert it to a string
  KD_colname_quo = quo_name(enquo(KD_colname))
  DF %>%
    mutate(KD_ref = CD[year == ref_year],
           Growth_cumdiv = c(rev(cumprod(rev(1/Growth[year < ref_year]))),
                             rep(NA, sum(year >= ref_year))),
           Growth_cumprod = c(rep(NA, sum(year <= ref_year)),
                              cumprod(Growth[year > ref_year])),
           # !! replaces the deprecated UQ() for naming the new column
           !!KD_colname_quo := case_when(
             year < ref_year ~ KD_ref*Growth_cumdiv,
             year == ref_year ~ KD_ref,
             year > ref_year ~ KD_ref*Growth_cumprod
           )) %>%
    # Drop the helper columns
    select(-KD_ref, -Growth_cumdiv, -Growth_cumprod)
}

Result:

> KD_calc(dat, 2007, KD)
   year     CD Growth       KD
1  2003 246.74   1.17 369.9192
2  2004 271.25   0.94 432.8054
3  2005 295.21   1.05 406.8371
4  2006 307.46   0.95 427.1789
5  2007 405.82   1.00 405.8200
6  2008 391.65   1.04 422.0528
7  2009 439.10   1.09 460.0376
8  2010 538.39   1.08 496.8406
9  2011 549.27   1.00 496.8406
10 2012 559.94   1.08 536.5878
11 2013 510.51   0.97 520.4902
12 2014 516.14   0.99 515.2853
13 2015 480.25   1.06 546.2024
14 2016 472.18   0.99 540.7404
15 2017 460.56   0.99 535.3330
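
For comparison, the rule from the question can also be applied literally, feeding each result into the next step, with base R's Reduce() and accumulate = TRUE. This is only a sketch: the data frame is rebuilt here from the question's values (with year as an integer sequence), the names ref, fwd and bwd are made up for illustration, and it assumes the rows are sorted by year and that the reference year is neither the first nor the last row.

```r
# Example data copied from the question
dat <- data.frame(
  year   = 2003:2017,
  CD     = c(246.74, 271.25, 295.21, 307.46, 405.82, 391.65, 439.1, 538.39,
             549.27, 559.94, 510.51, 516.14, 480.25, 472.18, 460.56),
  Growth = c(1.17, 0.94, 1.05, 0.95, 1, 1.04, 1.09, 1.08, 1, 1.08, 0.97,
             0.99, 1.06, 0.99, 0.99)
)

ref <- which(dat$year == 2007)   # row index of the reference year
n   <- nrow(dat)

# Forward: start at CD[ref] and multiply by each later year's Growth in turn.
# With init and accumulate = TRUE, the first element is the starting value.
fwd <- Reduce(`*`, dat$Growth[(ref + 1):n], accumulate = TRUE,
              init = dat$CD[ref])

# Backward: start at CD[ref] and divide by each earlier year's Growth,
# walking from 2006 down to 2003.
bwd <- Reduce(`/`, rev(dat$Growth[1:(ref - 1)]), accumulate = TRUE,
              init = dat$CD[ref])

# bwd[-1] drops the duplicated reference value; rev() restores year order
dat$KD <- c(rev(bwd[-1]), fwd)
```

Reduce() still iterates internally, so this is not faster than cumprod(); it just mirrors the "previous KD times current Growth" wording one step at a time.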

Answer 1: (score: 1)

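library(dplyr)
# s is 0 before 2007 and 1 from 2007 on. cumprod() in group 1 includes the
# 2007 row, which gives KD = CD there only because Growth is 1 in 2007.
dat %>%
  mutate(l = CD[year == 2007]) %>%
  group_by(s = cumsum(year == 2007)) %>%
  mutate(KD = ifelse(s == 0, l / rev(cumprod(rev(Growth))), l * cumprod(Growth)), l = NULL) %>%
  data.frame()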


   year     CD Growth s       KD
1  2003 246.74   1.17 0 369.9192
2  2004 271.25   0.94 0 432.8054
3  2005 295.21   1.05 0 406.8371
4  2006 307.46   0.95 0 427.1789
5  2007 405.82   1.00 1 405.8200
6  2008 391.65   1.04 1 422.0528
7  2009 439.10   1.09 1 460.0376
8  2010 538.39   1.08 1 496.8406
9  2011 549.27   1.00 1 496.8406
10 2012 559.94   1.08 1 536.5878
11 2013 510.51   0.97 1 520.4902
12 2014 516.14   0.99 1 515.2853
13 2015 480.25   1.06 1 546.2024
14 2016 472.18   0.99 1 540.7404
15 2017 460.56   0.99 1 535.3330
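
The grouped version above multiplies by the reference year's own Growth, which is harmless here because that value is 1. A possible base R variant that sets the reference row explicitly, so it does not depend on that coincidence, might look like the sketch below. The helper name kd_from_ref and its arguments are hypothetical, the data frame is rebuilt from the question's values, and the rows are assumed to be sorted by year.

```r
# Example data copied from the question
dat <- data.frame(
  year   = 2003:2017,
  CD     = c(246.74, 271.25, 295.21, 307.46, 405.82, 391.65, 439.1, 538.39,
             549.27, 559.94, 510.51, 516.14, 480.25, 472.18, 460.56),
  Growth = c(1.17, 0.94, 1.05, 0.95, 1, 1.04, 1.09, 1.08, 1, 1.08, 0.97,
             0.99, 1.06, 0.99, 0.99)
)

# Hypothetical helper: is_ref is a logical vector marking exactly one row
kd_from_ref <- function(cd, growth, is_ref) {
  ref <- which(is_ref)
  stopifnot(length(ref) == 1)
  n      <- length(growth)
  before <- seq_len(ref - 1)        # rows before the reference row
  after  <- seq_len(n - ref) + ref  # rows after the reference row
  c(cd[ref] / rev(cumprod(rev(growth[before]))),  # divide going backwards
    cd[ref],                                      # reference row: CD itself
    cd[ref] * cumprod(growth[after]))             # multiply going forwards
}

dat$KD <- kd_from_ref(dat$CD, dat$Growth, dat$year == 2007)
```

Because seq_len() returns an empty index when there is nothing before or after the reference row, this sketch should also cope with a reference year at either end of the data.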