使用dplyr和tbrf滚动几何方法

时间:2018-09-28 15:34:35

标签: r dplyr

我想使用dplyr和tbrf为每个“类型”组计算90天的滚动几何平均值和90%。下面的代码为每个日期而不是每90天生成百分位。它还错误地生成重复行。

旁注:我首先尝试使用%within%并在lubridate包中创建一个间隔。但是,dplyr目前不支持lubridate的Interval类,因此想尝试tbrf。我也尝试过tibbletime,RcppRoll和Zoo的Rollapply

##sample data###

Value=c(50,900,25,25,125,50,25,25,2000,25,25,
25,25,25,25,25,25,325,25,300,475,25)
Dates = as.Date(c("2015-02-23","2015-04-20","2015-06-17",
"2015-08-20","2015-10-05","2015-12-22",
"2016-01-19","2016-03-29","2016-05-03",
"2016-07-21","2016-09-08","2016-11-07",
"2017-02-27","2017-04-19","2017-06-29",
"2017-08-24","2017-10-23","2017-12-28",
"2018-01-16","2018-03-14","2018-05-29",
"2018-07-24"))
Type = c(rep("A", 11), rep("B", 11))

df=data.frame(Value,Dates,Type)



######failed attempt 1####
df2=df %>% group_by(Type) %>% 
 tbr_gmean(Value, Dates, "days", 90) %>%
  tbr_misc(Value, Dates,"days",  90, quantile, .(0.9)) 

 ## failed attempt #2##
 start.date = min(df$Dates)
 breaks = seq(start.date - 30*3600*24, start.date + 30*3600*24, "90 days")
 df$group = cut(df$Dates, breaks=breaks)
 DF= df %>% group_by(Type,group) %>% 
 mutate(Count=n(),gm=geoMean(Value),
 percentile_90=quantile(Value,0.90))

1 个答案:

答案 0 :(得分:2)

编辑:请尝试以下操作:

library(psych)
library(dplyr)
library(zoo)
dfmod<-df %>% 
    group_by(Type) %>% 
      arrange(Dates) %>%
        mutate(rnk = cumsum(c(TRUE, diff(Dates) > 5)))%>% #changed it from !=1 to reflect that you want the date difference to be within 5 days or less
          group_by(Type,rnk) %>% 
            mutate(GM = rollapply(Value, 2, geometric.mean, fill=NA, align="right"),
                    qt=rollapply(Value, 2, quantile, p=0.90, fill=NA, align="right")) #changed 5 to 2 so that the rolling sum is calculated for every 2 rows 
 head(dfmod)
## A tibble: 6 x 6
## Groups:   Type, rnk [1]
#  Dates      Type  Value   rnk    GM    qt
#  <date>     <fct> <dbl> <int> <dbl> <dbl>
#1 2018-10-03 A      35.3     1  NA    NA  
#2 2018-10-04 A      34.3     1  NA    NA  
#3 2018-10-05 A      34.6     1  NA    NA  
#4 2018-10-06 A      34.3     1  NA    NA  
#5 2018-10-07 A      34.1     1  34.5  35.1
#6 2018-10-08 A      34.7     1  34.4  34.6