Question

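I want to build a matrix containing all possible combinations of 0s and 1s. The conditions are that no combination is repeated and that every vector contains a specified number of 1s. For example, with n = 6 objects and r = 3 samples, each row has 6 slots, exactly 3 of which are 1. Using the choose() function in R, we find that there are 20 possibilities.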

choose(n=6,k=3) #calculate the number of combinations without replacement/repetition

The desired output matrix of all possible combinations looks like this:

1, 1 1 1 0 0 0
2, 1 1 0 1 0 0
3, 1 1 0 0 1 0
4, 1 1 0 0 0 1
5, 1 0 1 1 0 0
6, 1 0 1 0 1 0
7, 1 0 1 0 0 1
8, 0 1 1 1 0 0 
9, 0 1 1 0 1 0
10,0 1 1 0 0 1
11,0 0 1 1 1 0
12,0 0 1 1 0 1
14,0 0 0 1 1 1 
15,1 0 0 1 1 0
16,0 1 0 1 1 0 
17,1 0 0 1 0 1
18,1 0 0 0 1 1

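There should be 20 possibilities, but I found only 18. I want to apply this idea to a much larger data set, for example 200 slots with 100 ones instead of 6 slots with 3 ones. So I need an algorithm or a built-in function in R that produces this output. Thanks.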

Answer 1

t(combn(6,3,function(x)replace(numeric(6),x,1)))
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    1    1    0    0    0
 [2,]    1    1    0    1    0    0
 [3,]    1    1    0    0    1    0
 [4,]    1    1    0    0    0    1
 [5,]    1    0    1    1    0    0
 [6,]    1    0    1    0    1    0
 [7,]    1    0    1    0    0    1
 [8,]    1    0    0    1    1    0
 [9,]    1    0    0    1    0    1
[10,]    1    0    0    0    1    1
[11,]    0    1    1    1    0    0
[12,]    0    1    1    0    1    0
[13,]    0    1    1    0    0    1
[14,]    0    1    0    1    1    0
[15,]    0    1    0    1    0    1
[16,]    0    1    0    0    1    1
[17,]    0    0    1    1    1    0
[18,]    0    0    1    1    0    1
[19,]    0    0    1    0    1    1
[20,]    0    0    0    1    1    1

You can wrap this in a function:

fun <- function(n, m) t(combn(n, m, function(x) replace(numeric(n), x, 1)))
fun(6, 3)
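As a quick sanity check on the construction above, the sketch below confirms that the result has choose(n, m) rows, that every row contains exactly m ones, and that no row is repeated. The name `ones_matrix` is a hypothetical stand-in that restates the function above so the block runs on its own; only base R is assumed.

```r
# Hypothetical helper: same construction as the function above,
# restated here so this block is self-contained.
ones_matrix <- function(n, m) {
  t(combn(n, m, function(idx) replace(numeric(n), idx, 1)))
}

res <- ones_matrix(6, 3)

# Each comparison should evaluate to TRUE.
nrow(res) == choose(6, 3)   # one row per combination
all(rowSums(res) == 3)      # every row holds exactly three 1s
anyDuplicated(res) == 0     # no row appears twice
```

Running the same three checks against a hand-built list is one way to spot which rows are missing or duplicated.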

Answer 2

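This is just the permutations of the multiset of 0s and 1s. There are two libraries that handle these efficiently: RcppAlgos (I am the author) and arrangements.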

RcppAlgos::permuteGeneral(1:0, freqs = c(3, 3))

arrangements::permutations(x = 1:0, freq = c(3, 3))

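Both give the desired result. You will notice that the vector passed is in descending order (i.e. 1:0). This is because both libraries produce their output in lexicographical order.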

As noted in the comments, none of the posted solutions will work for your real data, because the number of results is far too large.

RcppAlgos::permuteCount(0:1, freqs = c(100,100))
[1] 9.054851e+58

arrangements::npermutations(x = 0:1, freq = c(100, 100), bigz = TRUE)
Big Integer ('bigz') :
[1] 90548514656103281165404177077484163874504589675413336841320
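The count above is the number of ways to place 100 ones in 200 slots, 200! / (100! 100!), which is the same quantity as choose(200, 100). A minimal base R sketch reproduces the magnitude and gives a rough storage estimate; the figure of 4 bytes per cell is an assumption (an integer matrix) and ignores any overhead.

```r
# Number of 0/1 vectors of length 200 with exactly 100 ones.
# choose() returns a double here, so the value is approximate.
n_rows <- choose(200, 100)
n_rows                          # about 9.05e+58

# Rough storage estimate, assuming an integer matrix with
# 200 columns and 4 bytes per cell (an assumption, no overhead).
bytes <- n_rows * 200 * 4
bytes                           # on the order of 7e+61 bytes
```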

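Since generating that much data at once is simply not feasible, both packages, arrangements and RcppAlgos, offer alternative approaches that make it possible to tackle larger problems.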

arrangements

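With the arrangements package, you can set up an iterator that lets you generate combinations/permutations n at a time and avoid the overhead of generating all of them.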

library(arrangements)
iperm <- ipermutations(x = 1:0, freq = c(3,3))

## get the first 5 permutations
iperm$getnext(d = 5)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    0    0    0
[2,]    1    1    0    1    0    0
[3,]    1    1    0    0    1    0
[4,]    1    1    0    0    0    1
[5,]    1    0    1    1    0    0

## get the next 5 permutations
iperm$getnext(d = 5)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    1    0    1    0
[2,]    1    0    1    0    0    1
[3,]    1    0    0    1    1    0
[4,]    1    0    0    1    0    1
[5,]    1    0    0    0    1    1

RcppAlgos

For RcppAlgos, the arguments lower and upper allow you to generate specific chunks.

library(RcppAlgos)
permuteGeneral(1:0, freqs = c(3,3), lower = 1, upper = 5)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    0    0    0
[2,]    1    1    0    1    0    0
[3,]    1    1    0    0    1    0
[4,]    1    1    0    0    0    1
[5,]    1    0    1    1    0    0

permuteGeneral(1:0, freqs = c(3,3), lower = 6, upper = 10)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    1    0    1    0
[2,]    1    0    1    0    0    1
[3,]    1    0    0    1    1    0
[4,]    1    0    0    1    0    1
[5,]    1    0    0    0    1    1

Since these chunks are generated independently, they can easily be generated and analyzed in parallel:

library(parallel)
mclapply(seq(1,20,5), function(x) {
    a <- permuteGeneral(1:0, freqs = c(3,3), lower = x, upper = x + 4)
    ## Do some analysis
}, mc.cores = detectCores() - 1)

You will not notice any speedup for this small example, but the gain becomes noticeable as the number of results grows.
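To illustrate why a chunk can start at an arbitrary position without generating the earlier rows, here is a minimal base R sketch that computes the row at a given index directly, by counting how many rows begin with a 1 at each position. This is not taken from either package and is not claimed to be how they work internally; the function name `unrank_ones` is hypothetical. It relies on choose() returning exact values, so it is only suitable for small n; the 200-slot case would need big-integer arithmetic.

```r
# Hypothetical helper: return row number `idx` (1-based) of the
# lexicographic listing with 1 ordered before 0, for vectors of
# length n containing exactly m ones.
unrank_ones <- function(idx, n, m) {
  v <- integer(n)
  ones_left <- m
  for (pos in seq_len(n)) {
    if (ones_left == 0) break
    # Number of remaining rows that have a 1 at this position.
    with_one <- choose(n - pos, ones_left - 1)
    if (idx <= with_one) {
      v[pos] <- 1L
      ones_left <- ones_left - 1
    } else {
      # Skip past every row that has a 1 here.
      idx <- idx - with_one
    }
  }
  v
}

unrank_ones(8, 6, 3)                      # 1 0 0 1 1 0
chunk <- t(sapply(6:10, unrank_ones, n = 6, m = 3))
chunk                                     # rows 6 to 10 of the full listing
```

Because each row depends only on its index, separate workers can compute disjoint index ranges without coordinating with each other.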

There is a lot more information on this topic in a summary I wrote for the question R: Permutations and combinations with/without replacement and for distinct/non-distinct items/multiset.

Find all combinations of 0s and 1s with a specific number of 1s
