Calculating annualized returns with unbalanced panel data

Time: 2018-09-12 15:49:44

Tags: r

I have the following panel data:

         id   date     returns  
         1    Jan 09 -0.07142857 
         1    Feb 09 -0.09615385 
         1    Mrz 09  0.03273322  
         1    Apr 09  0.14896989  
         1    May 09  0.06620690  
         1    Jun 09 -0.01811125 
         1    Jul 09 -0.07142857 
         1    Aug 09 -0.09615385 
         1    Sep 09  0.03273322  
         1    Oct 09  0.14896989  
         1    Nov 09  0.06620690  
         1    Dez 09  -0.01811125 

         2    Aug 09 -0.09615385 
         2    Sep 09  0.03273322  
         2    Oct 09  0.14896989  
         2    Nov 09  0.06620690  
         2    Dez 09 -0.01811125 

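What I want is a new column containing the annual return for each id in each year. If a firm does not have all 12 monthly returns in a year, as with id 2 in this example, the annual return should be based on the months available, e.g. RETannual = prod(1 + RETmonthly)^(1/5)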

The output should look like this:

         id   date     returns     RETan
         1    Jan 09 -0.07142857 
         1    Feb 09 -0.09615385 
         1    Mrz 09  0.03273322  
         1    Apr 09  0.14896989  
         1    May 09  0.06620690  
         1    Jun 09 -0.01811125 
         1    Jul 09 -0.07142857 
         1    Aug 09 -0.09615385 
         1    Sep 09  0.03273322  
         1    Oct 09  0.14896989  
         1    Nov 09  0.06620690  
         1    Dez 09  -0.01811125  0.00697433


         2    Aug 09 -0.09615385 
         2    Sep 09  0.03273322  
         2    Oct 09  0.14896989  
         2    Nov 09  0.06620690  
         2    Dez 09 -0.01811125   0.023432056
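
As a quick check of the formula against the expected output, here is a minimal base-R sketch for id 2 only. The vector `r` is copied from the table above; the names `gross` and `net` are introduced here for illustration. It suggests that the RETan value shown for id 2 (0.023432056) is the formula's result minus 1, i.e. a net rather than a gross return.

```r
# Monthly returns for id 2 (Aug 09 to Dez 09), copied from the table above.
r <- c(-0.09615385, 0.03273322, 0.14896989, 0.06620690, -0.01811125)

# Formula from the question: geometric mean of the monthly gross returns.
gross <- prod(1 + r)^(1 / length(r))

# Subtracting 1 turns the gross figure into a net return.
net <- gross - 1

print(round(c(gross = gross, net = net), 6))
```

Note that with exponent 1/n this is a per-month geometric mean. If a figure scaled to 12 months were wanted instead, the exponent would be 12/n, which is a different quantity from the one requested here.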

2 answers:

Answer 0: (score: 1)

We can do this as a grouped operation:

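library(tidyverse)
library(zoo)        # as.yearmon()
library(lubridate)  # year()

df1 %>%
  # "%b" is locale-dependent: "Mrz"/"Dez" may need recoding to "Mar"/"Dec" first
  group_by(id, year = year(as.yearmon(date, format = "%b %y"))) %>%
  # geometric mean of the monthly gross returns, kept only on the last row of each group
  mutate(RETan = prod(1 + returns)^(1 / n()),
         RETan = replace(RETan, row_number() < n(), NA_real_))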
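
The sample dates mix German abbreviations (Mrz, Dez) with English ones (May, Oct), so parsing them with "%b" is likely to yield NA for some rows whatever the session locale. The following is a minimal sketch of one way around that, not part of the original answer: it recodes the two German forms and looks the month up in base R's `month.abb`. The lookup table `german_to_english` is hypothetical and covers only the abbreviations seen in the sample, and the sketch assumes all two-digit years fall in the 2000s.

```r
# A few dates as they appear in the question (mixed German/English abbreviations).
dates <- c("Jan 09", "Mrz 09", "May 09", "Oct 09", "Dez 09")

# Hypothetical lookup table: only the two German forms present in the sample.
german_to_english <- c(Mrz = "Mar", Dez = "Dec")

# The first three characters hold the month abbreviation.
abbr <- substr(dates, 1, 3)
is_german <- abbr %in% names(german_to_english)
abbr[is_german] <- german_to_english[abbr[is_german]]

# month.abb is base R's built-in vector of English abbreviations,
# so this lookup does not depend on the session locale.
month <- match(abbr, month.abb)

# Assumption: two-digit years all belong to the 2000s.
year <- 2000L + as.integer(substr(dates, 5, 6))

print(data.frame(dates, month, year))
```

Because the month sits in characters 1 to 3 and the year in characters 5 to 6, the same code also handles the "Jan-09" style used in the other answer. The resulting `year` vector could then serve as the grouping variable.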

Answer 1: (score: 1)

With data.table you could try:

df <- read.table(stringsAsFactors = FALSE, header = TRUE, text = "id   date     returns
1    Jan-09 -0.07142857 
1    Feb-09 -0.09615385 
1    Mrz-09  0.03273322  
1    Apr-09  0.14896989  
1    May-09  0.06620690  
1    Jun-09 -0.01811125 
1    Jul-09 -0.07142857 
1    Aug-09 -0.09615385 
1    Sep-09  0.03273322  
1    Oct-09  0.14896989  
1    Nov-09  0.06620690  
1    Dez-09  -0.01811125 
2    Aug-09 -0.09615385 
2    Sep-09  0.03273322  
2    Oct-09  0.14896989  
2    Nov-09  0.06620690  
2    Dez-09 -0.01811125")

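library(data.table)
# Geometric mean of the monthly gross returns per id (.N = number of months available).
# The sample covers a single year; with several years, group by id and year instead.
setDT(df)[, .(RETan = prod(1 + returns)^(1 / .N)), by = id]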

#returns
   id    RETan
1:  1 1.006974
2:  2 1.023432

Of course, this does not give the same format as yours. For that, you could try:

# One row per month; RETan is NA except on the last row of each id
setDT(df)[, .(date = date, RETan = c(rep(NA_real_, .N - 1), prod(1 + returns)^(1 / .N))), by = id]

#returns
    id   date    RETan
 1:  1 Jan-09       NA
 2:  1 Feb-09       NA
 3:  1 Mrz-09       NA
 4:  1 Apr-09       NA
 5:  1 May-09       NA
 6:  1 Jun-09       NA
 7:  1 Jul-09       NA
 8:  1 Aug-09       NA
 9:  1 Sep-09       NA
10:  1 Oct-09       NA
11:  1 Nov-09       NA
12:  1 Dez-09 1.006974
13:  2 Aug-09       NA
14:  2 Sep-09       NA
15:  2 Oct-09       NA
16:  2 Nov-09       NA
17:  2 Dez-09 1.023432
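
Both answers return gross figures (1.006974, 1.023432), while the question's expected column shows net ones (0.00697433, 0.023432056), and the data.table version groups by id only. Below is a base-R sketch, under stated assumptions, that groups by id and year and subtracts 1. The `year` column is assumed to have been extracted from `date` beforehand, and the helper name `geo_mean_net` is made up for this example.

```r
# Sample data from the question; id 1 repeats the same six returns twice.
first_half <- c(-0.07142857, -0.09615385, 0.03273322,
                0.14896989, 0.06620690, -0.01811125)
df <- data.frame(
  id      = c(rep(1, 12), rep(2, 5)),
  year    = 2009,  # assumed to be already extracted from the date column
  returns = c(first_half, first_half, first_half[2:6])
)

# Hypothetical helper: geometric mean of gross returns, expressed as a net return.
geo_mean_net <- function(r) prod(1 + r)^(1 / length(r)) - 1

# One group per (id, year) pair that actually occurs in the data.
grp <- interaction(df$id, df$year, drop = TRUE)

# ave() applies the helper per group and recycles the result over the group's rows.
df$RETan <- ave(df$returns, grp, FUN = geo_mean_net)

# Keep the value only on the last row of each group, as in the requested layout.
is_last <- !duplicated(grp, fromLast = TRUE)
df$RETan[!is_last] <- NA_real_

print(df)
```

Using only base R keeps the sketch free of package dependencies. The same grouping idea carries over to data.table by writing `by = .(id, year)`.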