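I'm interested in generating regular expressions using machine learning and/or evolutionary algorithms. My approach requires me to randomly construct candidate regex strings, which those algorithms then evaluate.
Does anyone know of a context-free grammar that describes how regular expressions are constructed? I'm looking for a set of rules that specifies how the following elements can be combined into a valid expression.
For example, using the following subcomponents: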
# Raw strings (r"...") keep the backslashes literal. Without them, Python would
# turn "\a", "\b", "\t", etc. into control characters and reject "\u" outright.
basic_elements = {
    "Character Escapes": [r"\a", r"\b", r"\t", r"\r", r"\v", r"\f", r"\n", r"\e", r"\ ", r"\c", r"\u"],
    "Character Classes": ["[group]", "[^ group]", "[first - last]", r"\p{name}", r"\w", r"\s", r"\S", r"\d", r"\D"],
    "Anchors": ["^", "$", r"\A", r"\Z", r"\z", r"\G", r"\b", r"\B"],
    "Grouping Constructs": ["(subexpression)", "(?< name > subexpression)",
                            "(?< name1 - name2 > subexpression)",
                            "(?: subexpression )", "(?imnsx-imnsx: subexpression )", "(?= subexpression )",
                            "(?! subexpression )", "(?<= subexpression )", "(?<! subexpression )",
                            "(?> subexpression )"],
    "Quantifiers": ["*", "+", "?", "{n, }", "{n, m}", "*?", "+?", "??", "{ n }?", "{ n , }?", "{ n , m }?"],
    "Backreference Constructs": [r"\number", r"\k< name >"],
    "Alternation Constructs": ["|", "(?( expression ) yes | no )", "(?( name ) yes | no )"],
    "Substitutions": ["$", "${name}", "$$", "$&", "$`", "$'", "$+", "$_"],
    "Regular Expression Options": ["i", "m", "n", "s", "x"],
    "Miscellaneous Constructs": ["(?imnsx-imnsx)", "(?# comment )", "#"],
}
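To make the goal concrete, here is a minimal sketch of the kind of grammar-driven generation I have in mind. The grammar below is a hypothetical toy that I wrote by hand, not an official or complete regex grammar: it covers only alternation, concatenation, grouping, three quantifiers, and a few character classes. The names `GRAMMAR` and `generate` are my own, and the depth limit is just one assumed way to keep the random expansion from recursing forever.

```python
import random
import re

# Hypothetical toy grammar: each nonterminal maps to a list of productions,
# and each production is a list of symbols. Any symbol that is not a key
# in GRAMMAR is treated as a literal terminal.
GRAMMAR = {
    "regex": [["term"], ["term", "|", "regex"]],
    "term": [["factor"], ["factor", "term"]],
    "factor": [["atom"], ["atom", "quant"]],
    "atom": [["char"], ["class"], ["(", "regex", ")"]],
    "quant": [["*"], ["+"], ["?"]],
    "class": [[r"\d"], [r"\w"], [r"\s"], ["[a-z]"]],
    "char": [["a"], ["b"], ["c"]],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Randomly expand `symbol` into a regex string."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, always take the shortest production so the
        # expansion is guaranteed to terminate.
        production = min(productions, key=len)
    else:
        production = rng.choice(productions)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in production)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(5):
        candidate = generate("regex", rng)
        re.compile(candidate)  # raises re.error if the candidate is invalid
        print(candidate)
```

Because a quantifier can only follow a single atom, this toy grammar never produces stacked quantifiers such as `a**`, so every candidate should compile with Python's `re` module. What I'm missing is a grammar like this that covers the full set of constructs listed above.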
Thanks in advance.