All combinations of substrings that can be concatenated into the original string in R

Time: 2019-09-16 15:08:35

Tags: r string substring combinatorics

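Given a set of n unique, ordered characters,

how can I find all combinations of substrings that concatenate back into the original ordered string (in R)?

For example, for n = 5, using alphabetic characters starting at a, the input (as a character element) and the desired output (as a list of character vectors) would be as follows.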

Input:

ordered.chars <- "abcde"

Desired output:

ord.substr.list <- list(
c("a","b","c","d","e"),
c("ab","c","d","e"),
c("ab","cd","e"),
c("ab","c","de"),
c("a","bc","d","e"),
c("a","bc","de"),
c("a","b","cd","e"),
c("a","b","c","de"),
c("abc","d","e"),
c("abc","de"),
c("a","bcd","e"),
c("a","b","cde"),
c("ab","cde"),
c("abcd","e"),
c("a","bcde"))

A test of the condition that every listed character vector concatenates to the original character element:

all(unlist(lapply(ord.substr.list, function(x) paste(x, collapse=""))) %in% ordered.chars)
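
The check above only confirms that each vector pastes back to the original string. A slightly stricter sketch in base R could also confirm that no split appears twice and compare the number of splits with the number of possible ones. The helper name `check_splits` is hypothetical (it is not part of the original post), and the count 2^(n - 1) assumes that each of the n - 1 gaps between adjacent characters is either cut or left alone.

```r
# Hypothetical helper (not from the original post): validate a list of splits.
check_splits <- function(splits, original) {
    # Every vector must paste back to the original string.
    joined <- vapply(splits, paste, character(1), collapse = "")
    # Key each split with a separator so duplicates can be detected
    # (assumes "|" does not occur in the strings themselves).
    keys <- vapply(splits, paste, character(1), collapse = "|")
    list(
        all_concatenate = all(joined == original),
        no_duplicates   = !anyDuplicated(keys),
        n_splits        = length(splits),
        n_possible      = 2^(nchar(original) - 1)
    )
}

res <- check_splits(list(c("ab", "cde"), c("a", "bcde"), "abcde"), "abcde")
```

For the 15-element list in the question, `n_splits` would be 15 against an `n_possible` of 16, because the unsplit string "abcde" is not among the listed vectors.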

My Google / Stack Overflow searches led me to combn(), which is useful in similar situations but does not seem to help in any obvious way here.

1 answer:

Answer 0: (score: 2)

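The core of the problem is being able to generate the power set: each split corresponds to a subset of the n - 1 boundaries between adjacent characters.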

Here is a solution using RcppAlgos (I am the author).

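library(RcppAlgos)

customPowSetStr <- function(n) {
    # Interleave the first n letters with commas: "a" "," "b" "," "c" ...
    len <- n * 2 - 1
    v <- vector("character", length = len)
    v[seq(1, len, 2)] <- letters[1:n]
    v[seq(2, len, 2)] <- ","

    # Each combination picks which commas to remove: a value k > 0 refers to
    # the comma at position 2 * k (between letters k and k + 1); zeros are padding.
    # freqs allows at most n - 2 zeros, so at least one comma is always removed.
    comboGeneral(0:(n - 1), n - 1, freqs = c(n - 2, rep(1, n - 1)), FUN = function(x) {
        # Drop the chosen commas, then split on the ones that remain.
        strsplit(paste0(v[-(x[x > 0] * 2)], collapse = ""), ",")[[1]]
    })
}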

customPowSetStr(5)
[[1]]
[1] "ab" "c"  "d"  "e" 

[[2]]
[1] "a"  "bc" "d"  "e" 

[[3]]
[1] "a"  "b"  "cd" "e" 

[[4]]
[1] "a"  "b"  "c"  "de"

[[5]]
[1] "abc" "d"   "e"  

[[6]]
[1] "ab" "cd" "e" 

[[7]]
[1] "ab" "c"  "de"

[[8]]
[1] "a"   "bcd" "e"  

[[9]]
[1] "a"  "bc" "de"

[[10]]
[1] "a"   "b"   "cde"

[[11]]
[1] "abcd" "e"   

[[12]]
[1] "abc" "de" 

[[13]]
[1] "ab"  "cde"

[[14]]
[1] "a"    "bcde"

[[15]]
[1] "abcde"
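
Note that the output above contains the unsplit string "abcde" but not the fully split c("a","b","c","d","e") from the question, because the freqs argument forces at least one separator to be removed. As an alternative, here is a minimal base R sketch that uses combn() to choose cut positions among the n - 1 gaps. The function name `all_splits` and its `drop_unsplit` argument are hypothetical, and this is not the RcppAlgos approach.

```r
# Hypothetical base R helper: enumerate every way to cut a string into
# contiguous substrings by choosing a subset of the n - 1 gaps.
all_splits <- function(s, drop_unsplit = TRUE) {
    n <- nchar(s)
    if (n < 2) return(if (drop_unsplit) list() else list(s))
    gaps <- seq_len(n - 1)  # gap i lies between characters i and i + 1
    # Cut at a given set of gaps: pieces run from each start to each end.
    cut_at <- function(cuts) substring(s, c(1, cuts + 1), c(cuts, n))
    # k is the number of cuts; k = 0 corresponds to the unsplit string.
    ks <- if (drop_unsplit) gaps else c(0, gaps)
    out <- lapply(ks, function(k) {
        if (k == 0) return(list(s))
        combn(gaps, k, FUN = cut_at, simplify = FALSE)
    })
    # Flatten one level so the result is a list of character vectors.
    unlist(out, recursive = FALSE)
}

res <- all_splits("abcde")
length(res)
```

With `drop_unsplit = TRUE` the result holds the same 15 vectors as the desired output in the question, although in a different order; setting it to `FALSE` also returns the unsplit string.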