Generate all string combinations from a template

Date: 2017-06-23 19:20:06

Tags: regex algorithm combinations

How can I generate all combinations of strings from a template?

For example, take the template string

"{I|We} want {|2|3|4} {apples|pears}"

Curly braces "{...}" mark a group of one or more words, with the words separated by "|".

The class should generate every string that can be formed by picking one word from each word group.
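As a starting point, the template format described above could be split into word groups roughly like this. This is only a sketch: the function name parse_template is hypothetical, and it assumes braces are never nested and that "|" never needs escaping.

```python
import re

def parse_template(template):
    """Split a template into a list of word groups (hypothetical helper).

    Literal text between "{...}" groups becomes a one-element group, so the
    result can be fed directly to any Cartesian-product routine.
    """
    groups = []
    # A capturing group makes re.split keep the "{...}" parts; they land
    # on the odd indices of the resulting list.
    parts = re.split(r"(\{[^{}]*\})", template)
    for i, part in enumerate(parts):
        if i % 2 == 1:
            groups.append(part[1:-1].split("|"))  # strip braces, split words
        elif part:
            groups.append([part])                 # literal text, kept verbatim
    return groups

print(parse_template("{I|We} want {|2|3|4} {apples|pears}"))
```

Keeping the literal text (including its spaces) as one-element groups means the generated strings can later be built by plain concatenation.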

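I know this can be modelled as a finite automaton, and also as a regular expression. How can I generate the combinations efficiently?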

For example, the template can be written as

"G[0][j] want G[1][j] G[2][j]"

G[0] = {I, We}
G[1] = {2, 3, 4}
G[2] = {apples, pears}

First, generate all possible index combinations c = [0..1] [0..2] [0..1]:

000
001
010
011
020
021
100
101
110
111
120    
121

Then, for each c, replace G[i][j] with G[i][c[i]].
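The index scheme described in the question can be sketched in Python as follows. The names index_tuples and expand are made up for illustration, and the pattern string with "{}" placeholders is an assumption standing in for the G[i][j] slots.

```python
from itertools import product

G = [["I", "We"], ["2", "3", "4"], ["apples", "pears"]]

def index_tuples(groups):
    """Return an iterator over every index tuple c with 0 <= c[i] < len(groups[i])."""
    # product(range(2), range(3), range(2)) gives (0,0,0), (0,0,1), (0,1,0), ...
    return product(*(range(len(g)) for g in groups))

def expand(groups, pattern="{} want {} {}"):
    """Substitute groups[i][c[i]] into the pattern for every index tuple c."""
    for c in index_tuples(groups):
        yield pattern.format(*(groups[i][c[i]] for i in range(len(groups))))

for line in expand(G):
    print(line)
```

The explicit index tuples are shown only to mirror the listing above; calling product(*G) would yield the words directly without the intermediate indices.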

4 answers:

Answer 0 (score: 2)

Shell brace expansion

$ for q in {I,We}\ want\ {2,3,4}\ {apples,pears}; do echo "$q" ; done
I want 2 apples
I want 2 pears
I want 3 apples
I want 3 pears
I want 4 apples
I want 4 pears
We want 2 apples
We want 2 pears
We want 3 apples
We want 3 pears
We want 4 apples
We want 4 pears

Answer 1 (score: 1)

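The most efficient solution I have found for this problem so far is the Python module sre_yield.

"The goal of sre_yield is to efficiently generate all values that can match a given regular expression, or count possible matches efficiently."

Emphasis (on "all") mine.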

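Applying it to your stated problem: write the template as a regular expression pattern and pass it to sre_yield to get all possible combinations, or to count the possible matches: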

import sre_yield
result = set(sre_yield.AllStrings("(I|We) want (|2|3|4) (apples|pears)"))
len(result)
result

Output:

16
{'I want  apples',
 'I want  pears',
 'I want 2 apples',
 'I want 2 pears',
 'I want 3 apples',
 'I want 3 pears',
 'I want 4 apples',
 'I want 4 pears',
 'We want  apples',
 'We want  pears',
 'We want 2 apples',
 'We want 2 pears',
 'We want 3 apples',
 'We want 3 pears',
 'We want 4 apples',
 'We want 4 pears'}

PS: I use a set instead of the list shown on the project page, to avoid duplicates. If that is not what you want, use a list.
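The step of turning the template into a regular expression pattern could be automated along these lines. This is a sketch using only the standard re module: template_to_regex is a hypothetical helper name, and it assumes un-nested braces. The resulting pattern should then be usable with sre_yield.AllStrings as in the answer above.

```python
import re

def template_to_regex(template):
    """Convert "{a|b} text {c|d}" into an equivalent regex (hypothetical helper)."""
    # Splitting on a capturing group keeps the "{...}" parts at odd indices.
    parts = re.split(r"(\{[^{}]*\})", template)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            # Word group: escape each word and join them as alternatives.
            words = part[1:-1].split("|")
            out.append("(" + "|".join(re.escape(w) for w in words) + ")")
        else:
            # Literal text: escape it so it matches itself exactly.
            out.append(re.escape(part))
    return "".join(out)

pattern = template_to_regex("{I|We} want {|2|3|4} {apples|pears}")
print(pattern)
```

Escaping the words guards against templates that contain regex metacharacters such as "." or "+".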

Answer 2 (score: 1)

The principle is:

  • Regex -> NFA
  • NFA -> minimal DFA
  • DFS traversal of the DFA (collecting all characters along the way)

An implementation of this principle, e.g. in RexLex:

DeterministicAutomaton dfa = Pattern.compileGenericAutomaton("(I|We) want (2|3|4)? (apples|pears)")
  .toAutomaton(new FromGenericAutomaton.ToMinimalDeterministicAutomaton());
if (dfa.getProperty().isAcyclic()) {
  for (String s : dfa.getSamples(1000)) {
    System.out.println(s);
  }
}
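To illustrate just the last step (the DFS traversal), here is a small Python sketch over a hand-built automaton. It is not how RexLex works internally: the transition table is written by hand rather than compiled from a regex, the edges carry whole words instead of single characters, and the optional number group is left out to keep the table short. Like the isAcyclic check above, it assumes the automaton has no cycles; otherwise the traversal would never finish.

```python
def enumerate_strings(transitions, start, accepting):
    """Depth-first traversal of an acyclic automaton, yielding accepted strings."""
    stack = [(start, "")]
    while stack:
        state, prefix = stack.pop()
        if state in accepting:
            yield prefix
        # Push edges in reverse so they are explored in their listed order.
        for label, target in reversed(transitions.get(state, [])):
            stack.append((target, prefix + label))

# Hand-built automaton for "(I|We) want (2|3|4) (apples|pears)".
transitions = {
    0: [("I", 1), ("We", 1)],
    1: [(" want ", 2)],
    2: [("2 ", 3), ("3 ", 3), ("4 ", 3)],
    3: [("apples", 4), ("pears", 4)],
}

for s in enumerate_strings(transitions, start=0, accepting={4}):
    print(s)
```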

Answer 3 (score: 0)

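Convert each set of strings {...} into a string array, so that you end up with n arrays. For "{I|We} want {|2|3|4} {apples|pears}" we would therefore have 4 arrays (the literal word "want" counts as a one-element array).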

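Put each of those arrays into another array (wordSet in the code below). The finished strings are gathered in a list that I call collection.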

This is Java code, but it is simple enough that you should be able to convert it to any language. I have not tested it, but it should work.

import java.util.ArrayList;

// Both methods are meant to live inside a class of your choice.

// Entry point: fills `collection` with every combination, one word per set.
void makeStrings(String[][] wordSet, ArrayList<String> collection) {
    makeStrings(wordSet, collection, "", 0, 0);
}

void makeStrings(String[][] wordSet, ArrayList<String> collection, String currString, int x_pos, int y_pos) {
    // No more word sets left: one combination is complete, so store it.
    if (x_pos >= wordSet.length) {
        collection.add(currString);
        return;
    }

    // y_pos is out of bounds: no more words in the current set {...}.
    if (y_pos >= wordSet[x_pos].length) {
        return;
    }

    // Branch 1 ("horizontal"): accept the word at (x_pos, y_pos) and move on
    // to the next set. The separator is skipped for the first set so the
    // result has no leading space.
    String separator = (x_pos == 0) ? "" : " ";
    String combo_x = currString + separator + wordSet[x_pos][y_pos];
    makeStrings(wordSet, collection, combo_x, x_pos + 1, 0);

    // Branch 2 ("vertical"): leave the string unchanged and try the next
    // word within the same set.
    makeStrings(wordSet, collection, currString, x_pos, y_pos + 1);
}

*Edited for corrections
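Since the answer suggests the idea carries over to other languages, here is one possible Python rendering of the same recursion. It is a sketch rather than a line-by-line port: a for loop replaces the y_pos recursion, a generator replaces the collection list, and the name make_strings is chosen here for illustration.

```python
def make_strings(word_sets, prefix="", x=0):
    """Recursively yield every combination, one word per set, joined by spaces."""
    # All sets consumed: one combination is complete.
    if x == len(word_sets):
        yield prefix
        return
    # No separator before the first word, a single space before the others.
    sep = " " if x > 0 else ""
    for word in word_sets[x]:
        yield from make_strings(word_sets, prefix + sep + word, x + 1)

word_sets = [["I", "We"], ["want"], ["", "2", "3", "4"], ["apples", "pears"]]
for s in make_strings(word_sets):
    print(s)
```

With the empty word in the third set, the strings that skip the number contain two consecutive spaces, which is the same behavior as the sre_yield output shown earlier.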