RegEx grammar structure for evolutionary algorithms

Time: 2019-05-05 10:33:16

Tags: python regex genetic-algorithm context-free-grammar evolutionary-algorithm

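I am interested in generating regular expressions with machine learning and/or evolutionary algorithms. My approach requires me to randomly construct candidate regular expression strings, which these algorithms then evaluate.

Does anyone know of a context-free grammar that specifies how regular expressions are constructed? By following such a set of rules, the items below could be combined into a valid structure.

For example, using the following sub-components: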

# Raw strings (r"...") keep every backslash literal. Without them Python would
# turn "\b" or "\n" into control characters and reject "\u" as a syntax error.
basic_elements = {
    "Character Escapes": [r"\a", r"\b", r"\t", r"\r", r"\v", r"\f", r"\n", r"\e", r"\ ", r"\c", r"\u"],

    "Character Classes": ["[group]", "[^ group]", "[first - last]", r"\p{name}", r"\w", r"\s", r"\S", r"\d", r"\D"],

    "Anchors": ["^", "$", r"\A", r"\Z", r"\z", r"\G", r"\b", r"\B"],

    "Grouping Constructs": ["(subexpression)", "(?< name > subexpression)",
                            "(?< name1 - name2 > subexpression)",
                            "(?: subexpression )", "(?imnsx-imnsx: subexpression )", "(?= subexpression )",
                            "(?! subexpression )", "(?<= subexpression )", "(?<! subexpression )",
                            "(?> subexpression )"],

    "Quantifiers": ["*", "+", "?", "{n, }", "{n, m}", "*?", "+?", "??", "{ n }?", "{ n , }?", "{ n , m }?"],

    "Backreference Constructs": [r"\number", r"\k< name >"],

    "Alternation Constructs": ["|", "(?( expression ) yes | no )", "(?( name ) yes | no )"],

    "Substitutions": ["$", "${name}", "$$", "$&", "$", "$`", "$'", "$+", "$_", "", "", "", ""],

    "Regular Expression Options": ["i", "m", "n", "s", "x"],

    "Miscellaneous Constructs": ["(?imnsx-imnsx)", "(?# comment )", "#"],
}
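
To make the idea more concrete, here is a rough sketch of the kind of thing I have in mind. It assumes a tiny hand-written grammar restricted to a handful of constructs that Python's `re` module accepts, so it covers only a small fraction of the list above. The names `GRAMMAR`, `derive` and `is_valid` are hypothetical placeholders of mine, not part of any existing library, and the depth limit is an arbitrary choice.

```python
import random
import re

# Hypothetical toy grammar: each nonterminal maps to a list of productions,
# and each production is a list of symbols. Any symbol that is not a key of
# GRAMMAR is treated as a terminal. The first production of every
# nonterminal is non-recursive so that derivation can always be cut short.
GRAMMAR = {
    "regex":  [["term"], ["term", "|", "regex"]],
    "term":   [["factor"], ["factor", "term"]],
    "factor": [["atom"], ["atom", "quant"]],
    "atom":   [["char"], ["class"], ["(", "regex", ")"], ["(?:", "regex", ")"]],
    "class":  [[r"\d"], [r"\w"], [r"\s"], ["[a-z]"], ["[0-9]"], ["."]],
    "char":   [["a"], ["b"], ["c"], ["0"], ["1"]],
    "quant":  [["*"], ["+"], ["?"], ["{2}"], ["{1,3}"], ["*?"], ["+?"]],
}


def derive(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a string by picking random productions."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, always take the first (non-recursive)
        # production so the expansion is guaranteed to terminate.
        production = productions[0]
    else:
        production = rng.choice(productions)
    return "".join(derive(s, rng, depth + 1, max_depth) for s in production)


def is_valid(pattern):
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make runs repeatable
    for _ in range(5):
        candidate = derive("regex", rng)
        print(candidate, is_valid(candidate))
```

Each string produced by `derive("regex", rng)` would be one candidate individual for the evolutionary algorithm, and `is_valid` is just a safety check before a candidate is scored. What I am missing is a complete grammar of this shape rather than my ad hoc one.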

Thanks in advance.

0 answers:

No answers