I want to generate the unique sequences of the elements of a vector in R, where some of the elements are not unique:
sequence <- c(1,0,1,0)
For example:
result<-function(sequence)
result:
  seq1 seq2 seq3 seq4 seq5 seq6
1    1    1    0    0    0    1
2    0    1    0    1    1    0
3    1    0    1    0    1    0
4    0    0    1    1    0    1
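Note that every sequence contains each element of the original sequence, so the sum of a sequence is always 2.
gtools returns "too few different elements":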
result <- gtools::permutations(4, 4, coseq)
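I couldn't find any SO post that addresses this directly. The ones I found instead allow elements to repeat (Creating combination of sequences), which expand.grid can do, including for sequences of different lengths.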
Edit: the above is a minimal example. Ideally the solution would work for the following sequence:
sequence = c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)
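It is somewhat important that the solution does not generate duplicates and then remove them, because for longer sequences (e.g. 20 or 30 elements) generating the duplicates would be computationally very expensive.
Answer 0 (score: 3)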
m = apply(gtools::permutations(2, 4, 1:4, repeats.allowed = TRUE), 1, function(x) sequence[x])
m[,colSums(m) == 2]
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,]    1    1    1    0    0    0
#[2,]    1    0    0    1    1    0
#[3,]    0    1    0    1    0    1
#[4,]    0    0    1    0    1    1
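The call above enumerates all 2^4 candidate vectors and then keeps those that sum to 2. For a 0/1 sequence, one possible alternative that never builds rejected candidates is to pick the positions of the 1s with base R's combn. This is only a sketch and assumes the sequence contains nothing but 0s and 1s; the names s, n, k and m are illustrative and not from the original answer.

```r
# Sketch for a 0/1 sequence only: each distinct arrangement is determined
# by which positions hold the 1s, so enumerate those position sets directly.
s <- c(1, 0, 1, 0)
n <- length(s)
k <- sum(s == 1)

# combn() calls FUN once per k-subset of 1..n; each call returns one column.
m <- combn(n, k, FUN = function(pos) {
  x <- numeric(n)
  x[pos] <- 1
  x
})
dim(m)   # n rows, choose(n, k) columns
```

For the 16-element sequence in the question this produces choose(16, 9) columns directly, with no filtering step.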
Answer 1 (score: 3)
Several packages are built specifically for this.
First, using the arrangements package:
## sequence is a bad name as it is a base R function so we use s instead
s <- c(1,0,1,0)
arrangements::permutations(unique(s), length(s), freq = table(s))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    0    1    0
[3,]    1    0    0    1
[4,]    0    1    1    0
[5,]    0    1    0    1
[6,]    0    0    1    1
Next, we have RcppAlgos (I am the author):
RcppAlgos::permuteGeneral(unique(s), length(s), freqs = table(s))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    0    1    0
[3,]    1    0    0    1
[4,]    0    1    1    0
[5,]    0    1    0    1
[6,]    0    0    1    1
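Both are very efficient. To give you an idea, for the OP's actual need the other approaches will either fail (I think there is a limit on the number of rows in a matrix, 2^31 - 1, though I'm not sure) or take a very long time, because they have to generate 16! ~= 2.092e+13 permutations before doing any further processing. With these two packages, however, the result comes back instantly: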
## actual example needed by OP
sBig <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)
system.time(a <- arrangements::permutations(unique(sBig), length(sBig), freq = table(sBig)))
   user  system elapsed
  0.001   0.001   0.002
system.time(b <- RcppAlgos::permuteGeneral(unique(sBig), length(sBig), freqs = table(sBig)))
   user  system elapsed
  0.001   0.001   0.002
identical(a, b)
[1] TRUE
dim(a)
[1] 11440 16
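The row count can be cross-checked without enumerating anything: the number of distinct arrangements of a multiset is the multinomial coefficient n! / (n1! * n2! * ...). A small base R sketch, where counts and n_distinct are just illustrative names:

```r
sBig <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)
counts <- as.vector(table(sBig))   # 7 zeros and 9 ones

# Multinomial coefficient written as a product of binomials: choose where
# each value's copies go among the positions filled so far.
n_distinct <- prod(choose(cumsum(counts), counts))
n_distinct      # number of distinct arrangements of sBig

factorial(16)   # orderings produced if duplicates are generated first
```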
Answer 2 (score: 2)
Since you mentioned gtools::permutations, you could do the following.
First generate all permutations:
m <- apply(gtools::permutations(4, 4, 1:length(sequence)), 1, function(x) sequence[x])
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#[1,]    1    1    1    1    1    1    0    0    0     0     0     0     1     1
#[2,]    0    0    1    1    0    0    1    1    1     1     0     0     1     1
#[3,]    1    0    0    0    0    1    1    0    1     0     1     1     0     0
#[4,]    0    1    0    0    1    0    0    1    0     1     1     1     0     0
#     [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
#[1,]     1     1     1     1     0     0     0     0     0     0
#[2,]     0     0     0     0     1     1     0     0     1     1
#[3,]     1     0     1     0     0     1     1     1     1     0
#[4,]     0     1     0     1     1     0     1     1     0     1
Then remove the duplicate columns (which arise because the 1s are indistinguishable from each other, as are the 0s):
m[, !duplicated(apply(m, 2, paste, collapse = ""))]
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,]    1    1    1    0    0    0
#[2,]    0    0    1    1    1    0
#[3,]    1    0    0    1    0    1
#[4,]    0    1    0    0    1    1
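Removing duplicates after the fact means generating every one of the n! orderings first, which the question wants to avoid for longer inputs. A base R sketch that never creates a duplicate in the first place is to recurse over the distinct values that still have copies left. Note that multiset_perms is a hypothetical helper written for this illustration, not part of gtools or any other package, and that it returns one arrangement per row rather than per column.

```r
multiset_perms <- function(values, counts) {
  # Base case: a single element left, so exactly one arrangement.
  if (sum(counts) == 1) {
    return(matrix(values[counts > 0], nrow = 1, ncol = 1))
  }
  pieces <- list()
  # Branch once per distinct value that still has copies available,
  # so identical elements are never swapped with each other.
  for (i in which(counts > 0)) {
    rest <- counts
    rest[i] <- rest[i] - 1
    tail_rows <- multiset_perms(values, rest)
    pieces[[length(pieces) + 1]] <- cbind(values[i], tail_rows, deparse.level = 0)
  }
  do.call(rbind, pieces)
}

s <- c(1, 0, 1, 0)
tab <- table(s)
res <- multiset_perms(as.numeric(names(tab)), as.vector(tab))
res   # one distinct ordering of s per row
```

This is meant to show the idea rather than to be fast; for long sequences a dedicated package will be far quicker than repeated rbind calls.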