Making nested foreach loops more efficient in R?

Asked: 2015-09-17 18:11:40

Tags: r matrix foreach parallel-processing dataframe

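I wrote a function with three nested foreach loops that run in parallel. Its goal is to split a list of 30 [10,5] matrices (i.e. [[30]][10,5]) into a list of 5 [10,30] matrices (i.e. [[5]][10,30]).

However, I'm trying to run this function with 1,000,000 paths (i.e. foreach (m = 1:1000000)) and, unsurprisingly, the performance is terrible.

I'd like to avoid the apply functions if possible, because I've found they don't work well when combined with parallel foreach loops: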

library(foreach)
library(doParallel)

# input matr: a list of 30 [10,5] matrices
matrix_splitter <- function(matr) {
  time_horizon <- 30
  paths <- 10
  asset <- 5

  security_paths <- foreach(i = 1:asset, .combine = rbind, .packages = "doParallel", .export = "daily") %dopar% {
    foreach(m = 1:paths, .combine = rbind, .packages = "doParallel", .export = "daily") %dopar% {
      foreach(p = daily, .combine = c) %dopar% {
        p[m,i]  
      }
    }
  }
  df_securities <- as.data.frame(security_paths)
  split(df_securities, sample(rep(1:paths), asset))
}

Overall, I'm trying to convert this data format:

[[30]]
            [,1]        [,2]       [,3]       [,4]        [,5]
 [1,]  0.2800977  2.06715521  0.9196326  0.3560659  1.36126507
 [2,] -0.5119867  0.24329025  0.1513218 -1.2528092 -0.04795098
 [3,] -2.0293933 -1.17989270  0.3053376 -0.9528611  0.86758140
 [4,] -0.6419024 -0.24846720 -0.6640066 -1.7104961 -0.32759406
 [5,] -0.4340359 -0.44034013  3.3440507  0.7380613  2.01237069
 [6,] -0.6679914 -0.01332117  1.9286056 -0.7194116  0.15549978
 [7,]  0.5919820  0.11616685 -0.8424634 -0.7652715  1.34176688
 [8,]  0.8079152  0.40592119 -0.4291811  0.9358829 -0.97479314
 [9,] -0.0265207 -0.03598320  1.1287344  0.4732984  1.37792596
[10,]  1.0553966  0.65776721 -1.2833613 -0.2414846  0.81528686

into this format (obviously going up to V30):

$`5`
V1         V2          V3         V4         V5         V6         V7
result.2   -0.11822260  1.7712833  1.97737285 -1.6643193  0.4788075  1.2394064  1.4800787
result.7   -1.23251178  0.4267885 -0.07728632  0.3463092  0.8766395  0.6324840  0.5946710
result.2.1 -1.27309457 -0.3128173 -0.79561297 -0.4713307 -0.4344864  0.4688124 -0.5646857
result.7.1  0.51702719 -1.6242650 -2.37976199 -0.1088408  0.4846507 -0.7594376  0.9326529
result.2.2  1.77550390  0.9279155  0.26168402  0.4893835  1.4131326  0.5989508 -0.3434010
result.7.2 -0.01590682 -0.5568578  1.35789122 -0.1385092 -0.4501515 -0.2581724  0.5451699
result.2.3  0.30400225 -1.0245640 -0.05285694 -0.1354228  0.3070331 -0.7618850  1.0330961
result.7.3 -0.08139912  0.4106541  1.40418839  0.2471505  1.2106539  1.3844721  0.4006751
result.2.4  0.94977544 -0.8045054  1.48791211  1.4361686 -0.3789274 -1.9570125 -1.6576634
result.7.4  0.70449194  1.6887800  0.56447340  0.6465640  2.6865388 -0.7367524  0.6242624
                     V8         V9         V10         V11        V12         V13
result.2   -0.432404728 -1.6225350  0.09855465  0.17371907  0.3081843  0.15148452
result.7   -0.597420706  0.6173004  0.07518596  2.01741406  0.1767152 -0.39219471
result.2.1  0.918408322 -1.6896424 -0.13409626  0.38674224  0.3491750 -1.61083286
result.7.1  2.564057340 -0.7696399  1.06103614  1.38528367  1.1684045 -0.08467871
result.2.2  0.951995816  0.1910284  1.79943500  2.13909498  0.2847664  0.31094568
result.7.2 -0.479349220 -0.2368760  0.04298525 -0.40385960  0.3986555 -1.93499213
result.2.3 -1.382370069  1.0459845 -0.33106323 -0.43362925  0.7045572 -0.30211601
result.7.3 -1.457106442  0.1487447 -2.52392942 -0.02399523 -1.0349746  0.87666365
result.2.4 -0.848879365  0.7521024  0.16790915  0.47112444  0.8886361 -0.12733039
result.7.4 -0.003350467  0.4021858 -1.80031445 -1.42399232  1.0507765 -0.36193846

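3 answers:

Answer 0: (score: 1)

The plyr package is designed for exactly this kind of problem. The idea is to unlist the list, fill it into an array in the appropriate way, and then use alply to convert that array into a list of matrices.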

For example, the same steps convert a list of two 3x5 matrices into a list of five 2x3 matrices.
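The original code for this example did not survive, so here is a minimal base-R sketch of the idea, under my own assumptions: the data in lst is made up, and lapply stands in for alply so the snippet runs without plyr.

```r
# Two 3x5 matrices (made-up data, for illustration only)
lst <- list(matrix(1:15, nrow = 3), matrix(101:115, nrow = 3))

# Unlist and fill a 3 x 5 x 2 array: rows, columns, position in the list
arr <- array(unlist(lst), dim = c(3, 5, 2))

# One slice per column; arr[, j, ] is 3x2, so transpose it to get 2x3
res <- lapply(seq_len(dim(arr)[2]), function(j) t(arr[, j, ]))

# With plyr installed, alply(arr, 2, t) should give the same slices (not run here)
res[[1]]
```

Row k of res[[j]] is column j of the k-th input matrix, which is the same regrouping the question asks for on a larger scale.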

Answer 1: (score: 0)

I believe you're looking for this answer: Function to split a matrix into sub-matrices in R

You just need to use do.call(rbind, matlist) as the input to those functions.
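The functions from the linked answer aren't reproduced here, so the following is only a sketch of what stacking with do.call(rbind, matlist) makes possible. It assumes matlist is the question's list of 30 [10,5] matrices (filled with made-up random numbers below) and uses plain matrix() for the refolding instead of the linked functions.

```r
# 30 made-up 10x5 matrices standing in for the question's list
matlist <- lapply(1:30, function(step) matrix(rnorm(50), nrow = 10, ncol = 5))

# Stack them: 300 x 5, rows ordered by time step and, within a step, by path
big <- do.call(rbind, matlist)

# Each column holds one asset; refold its 300 values into 10 paths x 30 steps
per_asset <- lapply(seq_len(ncol(big)), function(i) matrix(big[, i], nrow = 10))

dim(per_asset[[1]])
```

Because the stacked rows run path-by-path within each time step, filling a 10-row matrix column by column puts time step t in column t, with no loops over individual elements.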

Answer 2: (score: 0)

I would convert your whole list into one big vector and then re-dimension it.

For my solution, I started with:

[[28]]
        [,1] [,2] [,3] [,4] [,5]
  [1,]    1   11   21   31   41
  [2,]    2   12   22   32   42
  [3,]    3   13   23   33   43
  [4,]    4   14   24   34   44
  [5,]    5   15   25   35   45
  [6,]    6   16   26   36   46
  [7,]    7   17   27   37   47
  [8,]    8   18   28   38   48
  [9,]    9   19   29   39   49
 [10,]   10   20   30   40   50

repeated thirty times. This is the variable orig. My code:

flattened.vec <- unlist(orig)  # flatten the list of matrices into one big vector
dim(flattened.vec) <- c(10, 150)  # one column per (time step, asset) pair, so the re-shape comes out right
transposed.matrix <- t(flattened.vec)  # transpose so the right elements go to the right place
new.matrix.list <- split(transposed.matrix, cut(seq_along(transposed.matrix) %% 5, 10, labels = FALSE))  # split the big transposed matrix into 5 vectors of 300 values, one per asset

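This code gives you 5 vectors of 300 values each. Give each one dim c(30,10) and then t() it, in a foreach if you like, to get five 10x30 matrices (I would normally use an apply function; I'm not familiar with the foreach library).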
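As a rough sketch of that last step, here is the code above followed by the re-dimension and transpose, using lapply rather than foreach. The test data is my own assumption: each time step gets distinct values so the regrouping can be checked, unlike the thirty identical matrices in my example.

```r
# Made-up input: 30 matrices of 10 paths x 5 assets, distinct per time step
orig <- lapply(1:30, function(step) matrix(1:50, nrow = 10, ncol = 5) + 100 * step)

# The four steps from the code above
flattened.vec <- unlist(orig)
dim(flattened.vec) <- c(10, 150)
transposed.matrix <- t(flattened.vec)
new.matrix.list <- split(transposed.matrix,
                         cut(seq_along(transposed.matrix) %% 5, 10, labels = FALSE))

# Each piece lists path 1 at steps 1..30, then path 2, and so on,
# so fold it into 30 x 10 and transpose to get 10 paths x 30 steps
final <- lapply(new.matrix.list, function(v) t(matrix(v, nrow = 30, ncol = 10)))

dim(final[[1]])
```

One thing to watch: the grouping is by index modulo 5, and remainder 0 sorts first, so the first list element holds asset 5 and the remaining four hold assets 1 to 4.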

After doing this, the final result for one of the 5 matrices is:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17]
 [1,]    1    1    1    1    1    1    1    1    1     1     1     1     1     1     1     1     1
 [2,]    2    2    2    2    2    2    2    2    2     2     2     2     2     2     2     2     2
 [3,]    3    3    3    3    3    3    3    3    3     3     3     3     3     3     3     3     3
 [4,]    4    4    4    4    4    4    4    4    4     4     4     4     4     4     4     4     4
 [5,]    5    5    5    5    5    5    5    5    5     5     5     5     5     5     5     5     5
 [6,]    6    6    6    6    6    6    6    6    6     6     6     6     6     6     6     6     6
 [7,]    7    7    7    7    7    7    7    7    7     7     7     7     7     7     7     7     7
 [8,]    8    8    8    8    8    8    8    8    8     8     8     8     8     8     8     8     8
 [9,]    9    9    9    9    9    9    9    9    9     9     9     9     9     9     9     9     9
[10,]   10   10   10   10   10   10   10   10   10    10    10    10    10    10    10    10    10

        [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30]
 [1,]     1     1     1     1     1     1     1     1     1     1     1     1     1
 [2,]     2     2     2     2     2     2     2     2     2     2     2     2     2
 [3,]     3     3     3     3     3     3     3     3     3     3     3     3     3
 [4,]     4     4     4     4     4     4     4     4     4     4     4     4     4
 [5,]     5     5     5     5     5     5     5     5     5     5     5     5     5
 [6,]     6     6     6     6     6     6     6     6     6     6     6     6     6
 [7,]     7     7     7     7     7     7     7     7     7     7     7     7     7
 [8,]     8     8     8     8     8     8     8     8     8     8     8     8     8
 [9,]     9     9     9     9     9     9     9     9     9     9     9     9     9
[10,]    10    10    10    10    10    10    10    10    10    10    10    10    10

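By the way, this is probably what the plyr package already does under the hood (as posted by Colonel Beauvel), just done by hand instead of with an external library.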