I am looking for a function that returns all unordered combinations of the elements of a vector. For example:
x<-c('red','blue','black')
uncomb(x)
[1]'red'
[2]'blue'
[3]'black'
[4]'red','blue'
[5]'blue','black'
[6]'red','black'
[7]'red','blue','black'
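I assume some library has a function that does this, but I can't find it. I have been trying permutations from gtools, but that is not the function I am looking for.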
Answer 0 (score: 14)
You can apply combn() over a sequence the length of x, passing each value as its m argument.
x <- c("red", "blue", "black")
do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE))
# [[1]]
# [1] "red"
#
# [[2]]
# [1] "blue"
#
# [[3]]
# [1] "black"
#
# [[4]]
# [1] "red" "blue"
#
# [[5]]
# [1] "red" "black"
#
# [[6]]
# [1] "blue" "black"
#
# [[7]]
# [1] "red" "blue" "black"
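To match the call in the question, the one-liner above can be wrapped in a function. This is only a minimal sketch: the name uncomb is taken from the question and is hypothetical, not a function from any package.

```r
# Hypothetical helper named after the call in the question; not part of any package.
uncomb <- function(x) {
  # For each size m = 1..length(x), list all size-m combinations, then flatten
  # the per-size lists into one list.
  do.call(c, lapply(seq_along(x), function(m) combn(x, m, simplify = FALSE)))
}

res <- uncomb(c("red", "blue", "black"))
length(res)  # 7, i.e. 2^3 - 1 non-empty subsets
```

One caveat: when x is a single positive number, combn() treats it as seq_len(x), so a length-one numeric input may not behave the way this wrapper suggests.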
If you prefer a matrix result, you can apply stringi::stri_list2matrix() to the list above.
stringi::stri_list2matrix(
do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE)),
byrow = TRUE
)
# [,1] [,2] [,3]
# [1,] "red" NA NA
# [2,] "blue" NA NA
# [3,] "black" NA NA
# [4,] "red" "blue" NA
# [5,] "red" "black" NA
# [6,] "blue" "black" NA
# [7,] "red" "blue" "black"
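Answer 1 (score: 1)
I was redirected here from List All Combinations With combn, as this was one of the dupe targets. This is an old question and the answer provided by @RichScriven is very good, but I wanted to give the community a few more options that are arguably more natural and more efficient (the last two).
We first note that the output is very similar to the Power Set. Calling powerSet from the rje package, we see that our output indeed matches every element of the power set except the first, which is equivalent to the Empty Set: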
x <- c("red", "blue", "black")
rje::powerSet(x)
[[1]]
character(0) ## empty set equivalent
[[2]]
[1] "red"
[[3]]
[1] "blue"
[[4]]
[1] "red" "blue"
[[5]]
[1] "black"
[[6]]
[1] "red" "black"
[[7]]
[1] "blue" "black"
[[8]]
[1] "red" "blue" "black"
If you don't want the first element, you can simply add [-1] to the end of the function call, like so: rje::powerSet(x)[-1].
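If installing a package is not an option, the power set idea can also be sketched in base R by letting each integer from 0 to 2^n - 1 encode one subset. The helper name power_set_bits is hypothetical, and this is an illustration of the concept rather than how rje implements powerSet.

```r
# Hypothetical base-R helper; a sketch, not the rje implementation.
power_set_bits <- function(x) {
  n <- length(x)
  # Each integer k in 0..(2^n - 1) encodes one subset:
  # binary digit j of k being 1 means x[j] is included.
  lapply(0:(2^n - 1), function(k) {
    keep <- (k %/% 2^(seq_len(n) - 1)) %% 2 == 1
    x[keep]
  })
}

ps <- power_set_bits(c("red", "blue", "black"))
length(ps)  # 8, including the empty set as the first element
ps[-1]      # drops the empty set, leaving the 7 non-empty subsets
```

Because the result has 2^n elements, this is only practical for short vectors.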
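The next two solutions come from the newer packages arrangements and RcppAlgos (I am the author of the latter), which offer the user considerable gains in efficiency. Both packages are capable of generating combinations of Multisets.
Why is this important?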
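It can be shown that there is a one-to-one mapping from the power set of A to all combinations of the multiset c(rep(emptyElement, length(A)), A) choose length(A), where emptyElement is a representation of the empty set (such as zero or a blank). With this in mind, observe (the blank is given a frequency of 2 rather than 3 here, so the all-blank row, i.e. the empty set, is left out):
library(arrangements)
combinations(x = c("",x), k = 3, freq = c(2, rep(1, 3)))
[,1] [,2] [,3]
[1,] "" "" "red"
[2,] "" "" "blue"
[3,] "" "" "black"
[4,] "" "red" "blue"
[5,] "" "red" "black"
[6,] "" "blue" "black"
[7,] "red" "blue" "black"
library(RcppAlgos)
comboGeneral(c("",x), 3, freqs = c(2, rep(1, 3)))
[,1] [,2] [,3]
[1,] "" "" "black"
[2,] "" "" "blue"
[3,] "" "" "red"
[4,] "" "black" "blue"
[5,] "" "black" "red"
[6,] "" "blue" "red"
[7,] "black" "blue" "red"
If you don't like dealing with blank elements and/or matrices, you can also return a list by making use of lapply:
lapply(seq_along(x), comboGeneral, v = x)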
[[1]]
[,1]
[1,] "black"
[2,] "blue"
[3,] "red"
[[2]]
[,1] [,2]
[1,] "black" "blue"
[2,] "black" "red"
[3,] "blue" "red"
[[3]]
[,1] [,2] [,3]
[1,] "black" "blue" "red"
lapply(seq_along(x), combinations, n = length(x), x = x)
[[1]]
[,1]
[1,] "red"
[2,] "blue"
[3,] "black"
[[2]]
[,1] [,2]
[1,] "red" "blue"
[2,] "red" "black"
[3,] "blue" "black"
[[3]]
[,1] [,2] [,3]
[1,] "red" "blue" "black"
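To make the one-to-one mapping concrete, the following base-R sketch strips the blanks from each row of a blank-padded matrix and compares the result with the subsets built size by size. The matrix is typed in by hand from the multiset output shown earlier, so no extra package is needed, and the variable names are illustrative only.

```r
x <- c("red", "blue", "black")

# Blank-padded rows, copied by hand from the multiset output shown earlier.
padded <- rbind(
  c("", "", "red"),
  c("", "", "blue"),
  c("", "", "black"),
  c("", "red", "blue"),
  c("", "red", "black"),
  c("", "blue", "black"),
  c("red", "blue", "black")
)

# Dropping the blanks from each row recovers exactly one subset per row.
subsets_from_rows <- lapply(seq_len(nrow(padded)), function(i) {
  row <- padded[i, ]
  row[row != ""]
})

# The same subsets built size by size with base R's combn().
subsets_by_size <- do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE))

same <- identical(subsets_from_rows, subsets_by_size)
same
```

The comparison relies on the rows being in the same order as the combn() output, which holds for the arrangements result above but not for the sorted RcppAlgos one.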
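Now we show that the last two methods are much more efficient (N.B. I removed do.call(c, ...) and simplify = FALSE from the answer provided by @RichScriven in order to compare the generation of similar outputs. I also included rje::powerSet for good measure):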
set.seed(8128)
bigX <- sort(sample(10^6, 20)) ## With this as an input, we will get 2^20 - 1 results.. i.e. 1,048,575
library(microbenchmark)
microbenchmark(powSetRje = rje::powerSet(bigX), ## rje is never attached above, so call it with the namespace prefix
powSetRich = lapply(seq_along(bigX), combn, x = bigX),
powSetArrange = lapply(seq_along(bigX), function(y) combinations(x = bigX, k = y)),
powSetAlgos = lapply(seq_along(bigX), comboGeneral, v = bigX),
unit = "relative")
Unit: relative
expr min lq mean median uq max neval
powSetRje 52.992681 15.055038 11.091203 13.586952 8.860661 7.347368 100
powSetRich 58.679666 14.864760 10.914700 13.198179 8.675812 6.017437 100
powSetArrange 1.042766 1.062227 1.071404 1.098491 1.126971 1.044827 100
powSetAlgos 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 100
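The result count quoted in the benchmark setup (2^20 - 1) can be checked with a little arithmetic in base R: summing the number of combinations of every size from 1 to n gives the number of non-empty subsets.

```r
n <- 20

# Number of non-empty subsets: sum over sizes k = 1..n of choose(n, k).
total_by_size <- sum(choose(n, 1:n))

# Closed form: every element is either in or out, minus the empty set.
total_closed <- 2^n - 1

total_by_size == total_closed  # both equal 1048575
```

Since the count doubles with every extra element, inputs much longer than this quickly become impractical to enumerate in full.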
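Going a step further, arrangements comes equipped with an argument called type, which lets the user choose a particular format for the output. One of those is type = "l", for list. It is similar to setting simplify = FALSE in combn and allows us to obtain output like that of powerSet. Observe: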
do.call(c, lapply(seq_along(x), combinations, n = length(x), x = x, type = "l"))
[[1]]
[1] "red"
[[2]]
[1] "blue"
[[3]]
[1] "black"
[[4]]
[1] "red" "blue"
[[5]]
[1] "red" "black"
[[6]]
[1] "blue" "black"
[[7]]
[1] "red" "blue" "black"
Benchmark:
microbenchmark(powSetRje = rje::powerSet(bigX)[-1],
powSetRich = do.call(c, lapply(seq_along(bigX), combn, x = bigX, simplify = FALSE)),
powSetArrange = do.call(c, lapply(seq_along(bigX), combinations, n = length(bigX), x = bigX, type = "l")),
times = 15, unit = "relative")
Unit: relative
expr min lq mean median uq max neval
powSetRje 4.925559 4.433365 4.013872 3.893674 3.819344 3.609616 15
powSetRich 5.732216 4.975508 4.542482 4.564668 4.288592 4.003765 15
powSetArrange 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 15
Answer 2 (score: 1)
A solution with a matrix result that does not use any external packages:
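store <- lapply(
seq_along(x),
function(i) {
# combn() returns one combination per column (i rows, N columns)
out <- combn(x, i)
N <- NCOL(out)
# pad every column with NA rows up to length(x); padding the flattened
# vector and refilling by row would mix up elements of different combinations
rbind(out, matrix(NA, nrow = length(x) - i, ncol = N))
})
t(do.call(cbind, store))
[,1] [,2] [,3]
[1,] "red" NA NA
[2,] "blue" NA NA
[3,] "black" NA NA
[4,] "red" "blue" NA
[5,] "red" "black" NA
[6,] "blue" "black" NA
[7,] "red" "blue" "black"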