R - Generate all combinations from 2 vectors given constraints

Asked: 2012-05-09 16:05:31

Tags: r combinations

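I want to generate all combinations of two vectors, given two constraints: a combination can never contain more than 3 characters from the first vector, and it must always contain at least one character from the second vector. I also want to be able to vary the final number of characters in the combination.

For example, here are two vectors: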

vec1=c("A","B","C","D")
vec2=c("W","X","Y","Z")

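Say I want 3 characters in the combination. Acceptable permutations would be: "A" "B" "X" or "A" "Y" "Z". An unacceptable permutation would be "A" "B" "C", since it contains no character from vec2.

Now say I want 5 characters in the combination. Acceptable permutations would be: "A" "C" "Z" "Y" or "A" "Y" "Z" "X". An unacceptable permutation would be "A" "C" "D" "B" "X", since it contains more than 3 characters from vec1.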

I figured I could use expand.grid to generate all combinations and then subset them somehow, but there must be an easier way. Thanks in advance!

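1 answer:

Answer 0 (score: 5)

I'm not sure whether this is easier, but you can avoid generating the permutations that do not satisfy your conditions by using this strategy: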

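  1. Generate all acceptable combinations from vec1.

  2. Generate all acceptable combinations from vec2.

  3. Generate all combinations that take one solution from 1. plus one solution from 2. I would then filter afterwards by the third condition (the total number of characters).

  4. (If you are looking for combinations, you are done. Otherwise:) produce all permutations of the letters within each result.

Now, let's have: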

    vec1 <- LETTERS [1:4]
    vec2 <- LETTERS [23:26]
    
    ## lists can eat up lots of memory, so use character vectors instead.
    combine <- function (x, y) 
      combn (y, x, paste, collapse = "")
    
    res1 <- unlist (lapply (0:3, combine, vec1))
    res2 <- unlist (lapply (1:length (vec2), combine, vec2))
    

Now we have:

    > res1
     [1] ""    "A"   "B"   "C"   "D"   "AB"  "AC"  "AD"  "BC"  "BD"  "CD"  "ABC"
    [13] "ABD" "ACD" "BCD"
    > res2
     [1] "W"    "X"    "Y"    "Z"    "WX"   "WY"   "WZ"   "XY"   "XZ"   "YZ"  
    [11] "WXY"  "WXZ"  "WYZ"  "XYZ"  "WXYZ"
    
    res3 <- outer (res1, res2, paste0)
    res3 <- res3 [nchar (res3) == 5]
    

So here you are:

    > res3
     [1] "ABCWX" "ABDWX" "ACDWX" "BCDWX" "ABCWY" "ABDWY" "ACDWY" "BCDWY" "ABCWZ"
    [10] "ABDWZ" "ACDWZ" "BCDWZ" "ABCXY" "ABDXY" "ACDXY" "BCDXY" "ABCXZ" "ABDXZ"
    [19] "ACDXZ" "BCDXZ" "ABCYZ" "ABDYZ" "ACDYZ" "BCDYZ" "ABWXY" "ACWXY" "ADWXY"
    [28] "BCWXY" "BDWXY" "CDWXY" "ABWXZ" "ACWXZ" "ADWXZ" "BCWXZ" "BDWXZ" "CDWXZ"
    [37] "ABWYZ" "ACWYZ" "ADWYZ" "BCWYZ" "BDWYZ" "CDWYZ" "ABXYZ" "ACXYZ" "ADXYZ"
    [46] "BCXYZ" "BDXYZ" "CDXYZ" "AWXYZ" "BWXYZ" "CWXYZ" "DWXYZ"
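
Since the question also asks to vary the final number of characters, the steps above could be wrapped in a function. The following is only a minimal sketch and not part of the original answer: the name `constrained_combos` and its parameters `n`, `max1` and `min2` are hypothetical, and it assumes that the two vectors do not overlap. Instead of filtering with `nchar` afterwards, it only builds the pairs of sizes that add up to `n`.

```r
# Hypothetical helper (not from the original answer): all combinations of
# n elements drawn from v1 and v2, with at most max1 elements from v1 and
# at least min2 elements from v2. Each result is pasted into one string.
constrained_combos <- function(v1, v2, n, max1 = 3, min2 = 1) {
  combine <- function(k, v) as.vector(combn(v, k, paste, collapse = ""))
  out <- character(0)
  # k1 = number of elements taken from v1; the remaining n - k1 come from v2
  for (k1 in 0:min(max1, length(v1), n)) {
    k2 <- n - k1
    if (k2 < min2 || k2 > length(v2)) next
    part <- outer(combine(k1, v1), combine(k2, v2), paste0)
    out <- c(out, as.vector(part))
  }
  out
}

vec1 <- LETTERS[1:4]
vec2 <- LETTERS[23:26]

combos5 <- constrained_combos(vec1, vec2, 5)
length(combos5)   # 52, the same number of results as res3 above

# Cross-check the count: sum over k1 of choose(4, k1) * choose(4, 5 - k1)
sum(choose(4, 0:3) * choose(4, 5 - 0:3))   # 0 + 4 + 24 + 24 = 52
```

The results come out grouped by the number of letters taken from `vec1`, so their order differs from `res3`, but the set of strings is the same.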
    

If you would like the results split into single letters:

    res <- matrix (unlist (strsplit (res3, "")), nrow = length (res3), byrow = TRUE)
    > res
          [,1] [,2] [,3] [,4] [,5]
     [1,] "A"  "B"  "C"  "W"  "X" 
     [2,] "A"  "B"  "D"  "W"  "X" 
     [3,] "A"  "C"  "D"  "W"  "X" 
     [4,] "B"  "C"  "D"  "W"  "X" 
    

(snip)

    [51,] "C"  "W"  "X"  "Y"  "Z" 
    [52,] "D"  "W"  "X"  "Y"  "Z" 
    

which are your combinations.
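
Step 4 of the strategy, producing every ordering of the letters within each result, is not shown above. One possible base-R sketch follows. `permute_chars` is a hypothetical helper, not part of the original answer, and it assumes that all letters within a string are distinct (which holds here as long as vec1 and vec2 do not overlap).

```r
# Hypothetical helper: all orderings of the distinct characters in one string.
permute_chars <- function(s) {
  chars <- strsplit(s, "")[[1]]
  if (length(chars) <= 1) return(s)
  out <- character(0)
  for (i in seq_along(chars)) {
    # put chars[i] in front, then permute the remaining characters
    rest <- paste(chars[-i], collapse = "")
    out <- c(out, paste0(chars[i], permute_chars(rest)))
  }
  out
}

perms <- permute_chars("ABX")
perms
length(perms)   # factorial(3) = 6 orderings
```

Applied to every combination with something like `unlist(lapply(res3, permute_chars))`, the 52 five-letter results would expand to 52 * factorial(5) = 6240 strings, so this step grows quickly with the number of characters.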