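I need to create all possible 5x5 matrices whose elements are all integers from 0 to 100 and whose elements sum to 100.
I don't know how to do it, or how to start... any suggestions?
Although I program in R, I'm really looking for ideas on how to approach this. Pseudocode is fine.
My first approach was to build every arrangement of the values 0 to 100 across the 25 cells (one value per matrix element) and then keep only the ones that sum to 100. But that is on the order of 100^25 candidates (101^25 if you count 0 as a value)... there is no way to get through this with that approach.
I would appreciate any ideas and/or help!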
Answer 0 (score: 5)
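The OP is looking for all integer partitions of the number 100 of maximal length 25. The partitions package comes with a function built for exactly this purpose, called restrictedparts. E.g.: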
library(partitions)
## all integer partitions of 10 of maximal length = 4
restrictedparts(10, 4)
[1,] 10 9 8 7 6 5 8 7 6 5 6 5 4 4 7 6 5 4 5 4 3 4 3
[2,] 0 1 2 3 4 5 1 2 3 4 2 3 4 3 1 2 3 4 2 3 3 2 3
[3,] 0 0 0 0 0 0 1 1 1 1 2 2 2 3 1 1 1 1 2 2 3 2 2
[4,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 2 2
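The number of columns in that output can be cross-checked without generating anything. The sketch below is a base R illustration, not part of the partitions package: the helper name count_restricted_parts is made up for this example. It relies on the standard fact that partitions of n into at most k parts are equinumerous with partitions of n into parts of size at most k, and counts the latter with a simple dynamic program.

```r
# Hypothetical helper (not from the partitions package): counts the
# partitions of n into at most k parts by counting partitions of n whose
# parts are all <= k, which is the same number by conjugation.
count_restricted_parts <- function(n, k) {
  # ways[j + 1] = number of partitions of j using the part sizes seen so far
  ways <- c(1, rep(0, n))
  for (part in seq_len(min(k, n))) {
    for (j in part:n) {
      ways[j + 1] <- ways[j + 1] + ways[j - part + 1]
    }
  }
  ways[n + 1]
}

count_restricted_parts(10, 4)    # matches the 23 columns printed above
count_restricted_parts(100, 25)  # compare with partitions::R(25, 100, TRUE)
```

The counts are stored as doubles, so they stay exact well beyond the sizes discussed here.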
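Once you have generated all of these combinations, simply create a 5x5 matrix from each one (note that restrictedparts does not distinguish between 0 0 3 and 0 3 0). The only problem is that there are so many possible combinations (partitions::R(25, 100, TRUE) = 139620591) that the function throws an error when you call restrictedparts(100, 25).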
test <- restrictedparts(100, 25)
Error in restrictedparts(100, 25) : NAs in foreign function call (arg 3)
In addition: Warning message:
In restrictedparts(100, 25) : NAs introduced by coercion to integer range
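The warning about coercion to integer range hints at a likely cause, though this is an inference rather than something confirmed from the package source: a result with 25 rows and 139620591 columns has more cells than a 32-bit R integer can count. A minimal base R check, using only the count quoted above:

```r
# Count reported above by partitions::R(25, 100, TRUE)
n_partitions <- 139620591

# Total number of cells in a 25 x n_partitions result matrix
n_cells <- 25 * n_partitions

# TRUE: the cell count does not fit in a 32-bit integer
n_cells > .Machine$integer.max

# Coercing it to integer yields NA (with the same kind of warning)
suppressWarnings(as.integer(n_cells))
```

Even if the size were representable, a matrix with that many integer cells would need roughly 14 GB of memory, so generating the partitions one at a time is the practical route either way.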
Since we cannot generate them all at once via restrictedparts, we can generate them individually using firstrestrictedpart and nextrestrictedpart, like so:
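funPartition <- function(n) {
    # start from the first partition of 100 into at most 25 parts
    p <- firstrestrictedpart(100, 25)
    mat <- matrix(nrow = 25, ncol = n)
    mat[, 1] <- p
    # seq_len(n)[-1] is empty when n == 1, whereas 2:n would count down
    for (i in seq_len(n)[-1]) {
        p <- nextrestrictedpart(p)
        mat[, i] <- p
    }
    mat
}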
head(funPartition(5))
[,1] [,2] [,3] [,4] [,5]
[1,] 100 99 98 97 96
[2,] 0 1 2 3 4
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
[6,] 0 0 0 0 0
The only problem is that this is not very efficient.
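Each column returned by funPartition is one partition of length 25, so turning it into a matrix is a single reshape. A minimal base R sketch, using a hand-written partition vector (the values are illustrative, not output from the package):

```r
# One partition of 100 padded with zeros to length 25 (illustrative values)
p <- c(96, 4, rep(0, 23))

# Fill a 5x5 matrix column by column
m <- matrix(p, nrow = 5, ncol = 5)

# Because the partition ignores order, any shuffle of p is another valid
# matrix built from the same partition
m_shuffled <- matrix(sample(p), nrow = 5, ncol = 5)

sum(m)           # 100
sum(m_shuffled)  # 100
```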
Enter RcppAlgos
A faster approach uses the RcppAlgos package (I am its author).
library(RcppAlgos)
combs <- comboGeneral(0:100,25,TRUE,"sum","==",100,rowCap=10^5)
matrixCombs <- lapply(1:nrow(combs), function(x) matrix(combs[x,], nrow = 5, ncol = 5))
matrixCombs[1:3]
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 0 0 0
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 100
[[2]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 0 0 0
[3,] 0 0 0 0 0
[4,] 0 0 0 0 1
[5,] 0 0 0 0 99
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 0 0 0
[3,] 0 0 0 0 0
[4,] 0 0 0 0 2
[5,] 0 0 0 0 98
If you really want permutations, no problem, just call permuteGeneral:
perms <- permuteGeneral(0:100,25,TRUE,"sum","==",100,rowCap=10^5)
matrixPerms <- lapply(1:nrow(perms), function(x) matrix(perms[x,], nrow = 5, ncol = 5))
matrixPerms[1:3]
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 0 0 0
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 100
[[2]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 0 0 0
[3,] 0 0 0 0 0
[4,] 0 0 0 0 100
[5,] 0 0 0 0 0
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 0 0 0
[3,] 0 0 0 0 100
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
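The row cap in these calls matters because the full set of ordered arrangements is far too large to enumerate. By the standard stars-and-bars argument, the number of ways to fill 25 cells with non-negative integers summing to 100 is choose(124, 24); no cell can exceed 100, so the upper bound adds no further restriction. The helper name count_matrices below is made up for this sketch:

```r
# Hypothetical helper: number of ways to write `total` as an ordered sum
# of `cells` non-negative integers (stars and bars)
count_matrices <- function(total, cells) {
  choose(total + cells - 1, cells - 1)
}

count_matrices(2, 2)     # 3: (0,2), (1,1), (2,0)
count_matrices(100, 25)  # number of distinct 5x5 matrices summing to 100
101^25                   # size of the brute-force search space, for comparison
```

The result is returned as a double, so for the 5x5 case it is an approximation of the exact integer, which is enough to see the order of magnitude.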
It is very fast as well. Let norm100Master be a wrapper around norm100 (defined in the other answer) that calls lapply(rep(5, runs), norm100).
funRcppAlgos <- function(myCap) {
    perms <- permuteGeneral(0:100,25,TRUE,"sum","==",100,rowCap=myCap)
    lapply(1:myCap, function(x) matrix(perms[x,], nrow = 5, ncol = 5))
}
library(microbenchmark)
runs <- 5000
microbenchmark(norm100Master(runs), funRcppAlgos(runs))
Unit: milliseconds
expr min lq mean median uq max neval
norm100Master(runs) 50.930848 56.413103 65.00415 57.341665 64.242075 125.5940 100
funRcppAlgos(runs) 8.711444 9.382808 13.05653 9.555321 9.912229 116.9166 100
Comparing only the generation of the integer partitions against funPartition above (without converting to matrices), we have:
microbenchmark(nextPs = funPartition(10^4),
algos = comboGeneral(0:100,25,TRUE,"sum","==",100,10^4))
Unit: milliseconds
expr min lq mean median uq max neval
nextPs 317.778757 334.35560 351.68058 343.81085 355.03575 521.13181 100
algos 9.438661 10.12685 10.60887 10.37617 10.85003 13.99447 100
Testing equality:
identical(t(apply(funPartition(10^4), 2, rev)),
comboGeneral(0:100,25,TRUE,"sum","==",100,10^4))
[1] TRUE
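The t(apply(..., 2, rev)) step is needed because the two results are laid out differently: funPartition stores one partition per column in descending order, while the comboGeneral output shown earlier has one combination per row in ascending order. A small base R illustration of that transformation on a toy matrix (the values are made up for the example):

```r
# Two toy "partitions" of 3, one per column, in descending order
by_column <- cbind(c(3, 0, 0), c(2, 1, 0))

# Reverse each column, then transpose so each partition becomes a row
by_row <- t(apply(by_column, 2, rev))

by_row
# row 1 is 0 0 3, row 2 is 0 1 2
```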
Answer 1 (score: 2)
Here is a function that generates a single target matrix. It is probably not the most efficient approach, and you would only get all possible combinations by running it a very large number of times. You can use lapply() over rep(5, num), as shown below, to generate num of them.
norm100 <- function(n=5){
    # generate some random values
    vec <- sample(0:100, size=n^2)
    # put them in a matrix, normalizing to 100 and rounding
    mat <- matrix(round((vec / sum(vec)) * 100), nrow=n)
    # find out how much the rounding makes us deviate from 100
    off_by <- sum(mat) - 100
    # get a random matrix element index
    modify_idx <- sample(length(mat), 1)
    # if adjusting by `off_by` would put us out of the target interval, try again
    while ((mat[modify_idx] - off_by) < 0 || (mat[modify_idx] - off_by) > 100){
        modify_idx <- sample(length(mat), 1)
    }
    # once we have one (usually on the first shot), adjust so that mat sums to 100
    mat[modify_idx] <- mat[modify_idx] - off_by
    return(mat)
}
runs <- 1000
matrices <- lapply(rep(5, runs), norm100)
Even after tens of thousands of runs I did not get any duplicates, but if you do, you can always throw away the dupes.
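Discarding duplicates from a list of matrices can be done with base R's duplicated(), which also works on lists. A minimal sketch with a small hand-built list standing in for the output of the lapply() call (the matrices here are illustrative, not generated by norm100):

```r
# Two distinct 5x5 matrices that each sum to 100
a <- matrix(c(100, rep(0, 24)), nrow = 5)
b <- matrix(c(rep(0, 24), 100), nrow = 5)

# A list containing one repeated matrix
matrices <- list(a, b, a)

# Keep only the first occurrence of each matrix
unique_matrices <- matrices[!duplicated(matrices)]

length(unique_matrices)  # 2
```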