R数据组的多个滚动均值窗口

时间:2018-09-05 14:17:21

标签: r dplyr purrr

使用数据(FANG),我知道交易量和开盘价之间存在平滑的关系。我也知道,最预测的滚动均值的长度因股票而异。对于某些人来说很短,每天2天。对于其他10.我想为每种股票创建长度在2到10天之间的多种滚动方式。

到目前为止,我尝试了tibbletime程序包并获得了一个开始,这样我就可以计算出一个的多个滚动平均值。

library(tibbletime)
library(tidyverse)

data("FANG")

FB <- FANG %>% filter(symbol == “FB”)

meanstep <- seq(2, 10, 1)

col_names <- map_chr(meanstep, ~paste0("rollmean_", .x))

rollers <- map(meanstep, ~rollify(mean, window = .x)) %>% set_names(nm = col_names)

FB_multiroll<- bind_cols(FB, invoke_map(rollers, x = FB$volume))

但是,我似乎无法弄清楚如何在按多只股票分组时进行这项工作。

我尝试添加:

FANG_with_multiroll<- FANG %>% group_by(symbol) %>% bind_cols(FANG, invoke_map(rollers, x =FANG$volume)

但这没用。它创建了滚动方式,但不是按组。取而代之的是,无论“ symbol”如何,它仅占用整个数据帧。任何想法,将不胜感激。我设法使它起作用,我计划找到每个符号的最高相关性或rsquared。如果您也对更好的方法有想法,我很感兴趣。

1 个答案:

答案 0 :(得分:0)

OP已用标记了该问题,但是6个月后答案都没有解决该问题。

在1.12.0版(于2019年1月13日在CRAN上发布)中,软件包获得了frollmean()函数,可用于创建按组长度不同的多种滚动方式。

data(FANG, package = "tibbletime")
library(data.table)   # version 1.12.0 +
meanstep <- 2:10
FANG_with_multiroll <- as.data.table(FANG)[
  , sprintf("rollmean_%02i", meanstep) := frollmean(volume, meanstep), by = symbol][]
FANG_with_multiroll
      symbol       date   open   high     low  close   volume adjusted rollmean_02 rollmean_03
   1:     FB 2013-01-02  27.44  28.18  27.420  28.00 69846400    28.00          NA          NA
   2:     FB 2013-01-03  27.88  28.47  27.590  27.77 63140600    27.77    66493500          NA
   3:     FB 2013-01-04  28.01  28.93  27.830  28.76 72715400    28.76    67928000  68567466.7
   4:     FB 2013-01-07  28.69  29.79  28.650  29.42 83781800    29.42    78248600  73212600.0
   5:     FB 2013-01-08  29.51  29.60  28.860  29.06 45871300    29.06    64826550  67456166.7
  ---                                                                                         
4028:   GOOG 2016-12-23 790.90 792.74 787.280 789.91   623400   789.91      796250    933733.3
4029:   GOOG 2016-12-27 790.68 797.86 787.657 791.55   789100   791.55      706250    793866.7
4030:   GOOG 2016-12-28 793.70 794.23 783.200 785.05  1132700   785.05      960900    848400.0
4031:   GOOG 2016-12-29 783.33 785.93 778.920 782.79   742200   782.79      937450    888000.0
4032:   GOOG 2016-12-30 782.75 782.78 770.410 771.82  1760200   771.82     1251200   1211700.0
      rollmean_04 rollmean_05 rollmean_06 rollmean_07 rollmean_08 rollmean_09 rollmean_10
   1:          NA          NA          NA          NA          NA          NA          NA
   2:          NA          NA          NA          NA          NA          NA          NA
   3:          NA          NA          NA          NA          NA          NA          NA
   4:    72371050          NA          NA          NA          NA          NA          NA
   5:    66377275    67071100          NA          NA          NA          NA          NA
  ---                                                                                    
4028:      931575      990440   1230083.3     1286314     1333588     1420944     1488560
4029:      897575      903080    956883.3     1167086     1224163     1273089     1357760
4030:      878575      944600    941350.0      982000     1162788     1214000     1259050
4031:      821850      851300    910866.7      912900      952025     1116056     1166820
4032:     1106050     1009520   1002783.3     1032200     1018813     1041822     1180470

为了证明这适用于每个组,我们可以打印每个组的前几行(也只能打印前十列):

FANG_with_multiroll[, head(.SD, 3), .SDcols = 1:10, by = symbol]
    symbol symbol       date     open     high      low    close   volume  adjusted rollmean_02 rollmean_03
 1:     FB     FB 2013-01-02  27.4400  28.1800  27.4200  28.0000 69846400  28.00000          NA          NA
 2:     FB     FB 2013-01-03  27.8800  28.4700  27.5900  27.7700 63140600  27.77000    66493500          NA
 3:     FB     FB 2013-01-04  28.0100  28.9300  27.8300  28.7600 72715400  28.76000    67928000    68567467
 4:   AMZN   AMZN 2013-01-02 256.0800 258.1000 253.2600 257.3100  3271000 257.31000          NA          NA
 5:   AMZN   AMZN 2013-01-03 257.2700 260.8800 256.3700 258.4800  2750900 258.48001     3010950          NA
 6:   AMZN   AMZN 2013-01-04 257.5800 259.8000 256.6500 259.1500  1874200 259.14999     2312550     2632033
 7:   NFLX   NFLX 2013-01-02  95.2100  95.8100  90.6900  92.0100 19431300  13.14429          NA          NA
 8:   NFLX   NFLX 2013-01-03  91.9700  97.9200  91.5300  96.5900 27912500  13.79857    23671900          NA
 9:   NFLX   NFLX 2013-01-04  96.5400  97.7100  95.5400  95.9800 17761100  13.71143    22836800    21701633
10:   GOOG   GOOG 2013-01-02 719.4212 727.0013 716.5512 723.2512  5101500 361.26435          NA          NA
11:   GOOG   GOOG 2013-01-03 724.9313 731.9312 720.7212 723.6713  4653700 361.47415     4877600          NA
12:   GOOG   GOOG 2013-01-04 729.3412 741.4713 727.6812 737.9713  5547600 368.61701     5100650     5100933