R - Given a matrix and a power, produce multiple matrices containing all unique combinations of matrix columns

Time: 2018-03-28 15:47:46

Tags: r matrix cross-product

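Based on the related question linked below (see @Aleh's solution): I want to compute only the unique products between the columns of a matrix for a given power.

For example, for N = 5, M = 3, p = 2, we get the products of columns (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3). I want to modify @Aleh's code so that it computes only the products of columns (1,1), (1,2), (1,3), (2,2), (2,3), (3,3). However, I want to do this for every order up to p.

Can someone help me do this in R?

Many thanks in advance!

Related question: R - Given a matrix and a power, produce multiple matrices containing all combinations of matrix columns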

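2 Answers:

Answer 0: (score: 5)

We create the following function, which takes all "unique" permutations for the chosen p and multiplies the relevant columns of the matrix: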

fun <- function(mat,p) {
  mat <- as.data.frame(mat)
  combs <- do.call(expand.grid,rep(list(seq(ncol(mat))),p)) # all combinations including permutations of same values
  combs <- combs[!apply(combs,1,is.unsorted),]              # "unique" permutations only
  rownames(combs) <- apply(combs,1,paste,collapse="-")      # Just for display of output, we keep info of combinations in rownames
  combs <- combs[order(rownames(combs)),]                   # sort to have desired column order on output
  apply(combs,1,function(x) Reduce(`*`,mat[,x]))            # multiply the relevant columns
}
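To make the filtering step in fun easier to follow, here is a minimal sketch on a small case (M = 3, p = 2). The variable names grid, keep and unique_combs are illustrative only and do not appear in the answer. The idea is that expand.grid produces every index tuple, and is.unsorted flags the tuples whose indices decrease at some point, so dropping those leaves one representative per unordered combination.

```r
M <- 3
p <- 2

# every index tuple of length p over 1..M (M^p rows)
grid <- do.call(expand.grid, rep(list(seq_len(M)), p))

# is.unsorted() is TRUE when a row is not in non-decreasing order,
# so negating it keeps exactly one ordering per combination
keep <- !apply(grid, 1, is.unsorted)
unique_combs <- grid[keep, ]

nrow(grid)            # number of ordered tuples
nrow(unique_combs)    # number of combinations with repetition
choose(M + p - 1, p)  # closed-form count of combinations with repetition
```

Because all M^p tuples are generated before filtering, this step grows quickly with p, which is probably why the second answer builds the combinations directly.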

Examples

N = 5
M = 3
mat1 = matrix(1:(N*M),N,M)
#      [,1] [,2] [,3]
# [1,]    1    6   11
# [2,]    2    7   12
# [3,]    3    8   13
# [4,]    4    9   14
# [5,]    5   10   15

M = 4
mat2 = matrix(1:(N*M),N,M)
#      [,1] [,2] [,3] [,4]
# [1,]    1    6   11   16
# [2,]    2    7   12   17
# [3,]    3    8   13   18
# [4,]    4    9   14   19
# [5,]    5   10   15   20

lapply(2:4,fun,mat=mat1)
# [[1]]
#      1-1 1-2 1-3 2-2 2-3 3-3
# [1,]   1   6  11  36  66 121
# [2,]   4  14  24  49  84 144
# [3,]   9  24  39  64 104 169
# [4,]  16  36  56  81 126 196
# [5,]  25  50  75 100 150 225
# 
# [[2]]
#      1-1-1 1-1-2 1-1-3 1-2-2 1-2-3 1-3-3 2-2-2 2-2-3 2-3-3 3-3-3
# [1,]     1     6    11    36    66   121   216   396   726  1331
# [2,]     8    28    48    98   168   288   343   588  1008  1728
# [3,]    27    72   117   192   312   507   512   832  1352  2197
# [4,]    64   144   224   324   504   784   729  1134  1764  2744
# [5,]   125   250   375   500   750  1125  1000  1500  2250  3375
# 
# [[3]]
#      1-1-1-1 1-1-1-2 1-1-1-3 1-1-2-2 1-1-2-3 1-1-3-3 1-2-2-2 1-2-2-3 1-2-3-3 1-3-3-3 2-2-2-2 2-2-2-3 2-2-3-3 2-3-3-3 3-3-3-3
# [1,]       1       6      11      36      66     121     216     396     726    1331    1296    2376    4356    7986   14641
# [2,]      16      56      96     196     336     576     686    1176    2016    3456    2401    4116    7056   12096   20736
# [3,]      81     216     351     576     936    1521    1536    2496    4056    6591    4096    6656   10816   17576   28561
# [4,]     256     576     896    1296    2016    3136    2916    4536    7056   10976    6561   10206   15876   24696   38416
# [5,]     625    1250    1875    2500    3750    5625    5000    7500   11250   16875   10000   15000   22500   33750   50625

fun(mat2,2)
#      1-1 1-2 1-3 1-4 2-2 2-3 2-4 3-3 3-4 4-4
# [1,]   1   6  11  16  36  66  96 121 176 256
# [2,]   4  14  24  34  49  84 119 144 204 289
# [3,]   9  24  39  54  64 104 144 169 234 324
# [4,]  16  36  56  76  81 126 171 196 266 361
# [5,]  25  50  75 100 100 150 200 225 300 400

Answer 1: (score: 3)

If I understand correctly, this is what you are looking for:

# all combinations of p elements out of M with repetition
# c.f. http://www.mathsisfun.com/combinatorics/combinations-permutations.html
comb_rep <- function(p, M) {
  combn(M + p - 1, p) - 0:(p - 1)
}

# use cols from mat to form a new matrix
# take row products
col_prod <- function(cols, mat) {
  apply(mat[ ,cols], 1, prod)
}

N <- 5
M <- 3
p <- 3
mat <- matrix(1:(N*M),N,M)

col_comb <- lapply(2:p, comb_rep, M)
col_comb
#> [[1]]
#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,]    1    1    1    2    2    3
#> [2,]    1    2    3    2    3    3
#> 
#> [[2]]
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    1    1    1    1    1    1    2    2    2     3
#> [2,]    1    1    1    2    2    3    2    2    3     3
#> [3,]    1    2    3    2    3    3    2    3    3     3

# prepend original matrix
res_mat <- list()
res_mat[[1]] <- mat
c(res_mat, 
  lapply(col_comb, function(cols) apply(cols, 2, col_prod, mat)))
#> [[1]]
#>      [,1] [,2] [,3]
#> [1,]    1    6   11
#> [2,]    2    7   12
#> [3,]    3    8   13
#> [4,]    4    9   14
#> [5,]    5   10   15
#> 
#> [[2]]
#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,]    1    6   11   36   66  121
#> [2,]    4   14   24   49   84  144
#> [3,]    9   24   39   64  104  169
#> [4,]   16   36   56   81  126  196
#> [5,]   25   50   75  100  150  225
#> 
#> [[3]]
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    1    6   11   36   66  121  216  396  726  1331
#> [2,]    8   28   48   98  168  288  343  588 1008  1728
#> [3,]   27   72  117  192  312  507  512  832 1352  2197
#> [4,]   64  144  224  324  504  784  729 1134 1764  2744
#> [5,]  125  250  375  500  750 1125 1000 1500 2250  3375
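However, this is not really efficient, because, for example, the third power is computed from three columns of the original matrix rather than from one column of the original matrix and one column of the second power.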
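One way to address that inefficiency could look like the sketch below. It is not part of either answer: extend_power is a hypothetical helper, and the list layout with values and idx is an assumption made for illustration. Each product of order p is obtained by multiplying one product of order p - 1 by a single original column whose index is at least as large as the last index already used, which keeps the index tuples sorted and therefore unique.

```r
# Hypothetical helper (not from either answer).
# prev     : N x K matrix holding the products of order p - 1
# prev_idx : (p - 1) x K matrix; column k lists the sorted column indices behind prev[, k]
extend_power <- function(mat, prev, prev_idx) {
  M <- ncol(mat)
  last <- prev_idx[nrow(prev_idx), ]               # largest index of each earlier combination
  k <- rep(seq_along(last), times = M - last + 1)  # which earlier product to reuse
  j <- unlist(lapply(last, function(l) l:M))       # which original column to multiply in (j >= last)
  list(
    values = prev[, k, drop = FALSE] * mat[, j, drop = FALSE],
    idx    = rbind(prev_idx[, k, drop = FALSE], j, deparse.level = 0)
  )
}

mat <- matrix(1:15, 5, 3)
p1 <- list(values = mat, idx = matrix(seq_len(ncol(mat)), nrow = 1))
p2 <- extend_power(mat, p1$values, p1$idx)  # order 2, one multiplication per column
p3 <- extend_power(mat, p2$values, p2$idx)  # order 3, reusing the order-2 products
p3$values[1, ]
```

For this small matrix the columns come out in the same order as the comb_rep results, so the first row of p3$values should agree with the first row of the order-3 output shown above. Whether this is actually faster on large inputs has not been benchmarked here.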

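Edit: Testing with the actual dimensions mentioned in the comments shows that @Moody_Mudskipper's way of multiplying the columns is faster, while my way of computing the combinations is a bit faster, so it makes sense to combine the two: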

# original function from @Moody_Mudskipper's answer
fun <- function(mat,p) {
  mat <- as.data.frame(mat)
  combs <- do.call(expand.grid,rep(list(seq(ncol(mat))),p)) # all combinations including permutations of same values
  combs <- combs[!apply(combs,1,is.unsorted),]              # "unique" permutations only
  rownames(combs) <- apply(combs,1,paste,collapse="-")      # Just for display of output, we keep info of combinations in rownames
  combs <- combs[order(rownames(combs)),]                   # sort to have desired column order on output
  apply(combs,1,function(x) Reduce(`*`,mat[,x]))            # multiply the relevant columns
}
combined <- function(mat, p) {
  mat <- as.data.frame(mat)
  combs <- combn(ncol(mat) + p - 1, p) - 0:(p - 1)          # all combinations with repetition
  colnames(combs) <- apply(combs, 2, paste, collapse = "-") # Just for display of output, we keep info of combinations in colnames
  apply(combs, 2, function(x) Reduce(`*`, mat[ ,x]))        # multiply the relevant columns
}
N <- 10000
M <- 25
p <- 4
mat <- matrix(runif(N*M),N,M)
microbenchmark::microbenchmark(
  fun(mat, p),
  combined(mat, p),
  times = 10
)
#> Unit: seconds
#>              expr      min       lq     mean   median       uq      max neval
#>       fun(mat, p) 3.456853 3.698680 4.067995 4.032647 4.341944 4.869527    10
#>  combined(mat, p) 2.543994 2.738313 2.870446 2.793768 3.090498 3.254232    10
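The expression combn(M + p - 1, p) - 0:(p - 1), used in comb_rep and again in combined, may be easier to follow on a small case. The sketch below uses illustrative names (strict, with_rep) that are not in the answer. combn returns strictly increasing index tuples as columns; subtracting 0 from the first row, 1 from the second, and so on turns them into non-decreasing tuples over 1..M, that is, combinations with repetition. The subtraction works because R recycles the vector 0:(p - 1) down each column of the p-row matrix.

```r
M <- 3
p <- 2

# strictly increasing tuples drawn from 1..(M + p - 1), one per column
strict <- combn(M + p - 1, p)

# shift row i down by (i - 1): strictly increasing becomes non-decreasing over 1..M
with_rep <- strict - 0:(p - 1)

strict
with_rep
```

With M = 3 and p = 2 this should reproduce the first element of col_comb shown earlier in this answer.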

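Note that the two functions do not produce identical results for M > 9, because the column order differs: fun sorts the column names lexically, so that 1-10 < 1-2. If the same lexical sorting is inserted into combined, the results are identical.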
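A minimal sketch of that last point follows. fun and combined are repeated from above so the block runs on its own; combined_sorted is a hypothetical wrapper added here for illustration and is not part of the original answer. It reorders the columns of combined by name using the same default order() call that fun applies to its row names.

```r
fun <- function(mat, p) {
  mat <- as.data.frame(mat)
  combs <- do.call(expand.grid, rep(list(seq(ncol(mat))), p))
  combs <- combs[!apply(combs, 1, is.unsorted), ]
  rownames(combs) <- apply(combs, 1, paste, collapse = "-")
  combs <- combs[order(rownames(combs)), ]
  apply(combs, 1, function(x) Reduce(`*`, mat[, x]))
}

combined <- function(mat, p) {
  mat <- as.data.frame(mat)
  combs <- combn(ncol(mat) + p - 1, p) - 0:(p - 1)
  colnames(combs) <- apply(combs, 2, paste, collapse = "-")
  apply(combs, 2, function(x) Reduce(`*`, mat[, x]))
}

# Hypothetical wrapper: apply the lexical column ordering used by fun
combined_sorted <- function(mat, p) {
  res <- combined(mat, p)
  res[, order(colnames(res)), drop = FALSE]
}

set.seed(1)
mat <- matrix(runif(4 * 10), 4, 10)  # M = 10, so names such as "1-10" occur

# character sorting places "1-10" before "1-2"
order(c("1-2", "1-10"), method = "radix")

all.equal(fun(mat, 2), combined_sorted(mat, 2))
```

Sorting by name costs a little extra time, so it is probably only worth adding when the output has to match fun column for column.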