Regex: one argument and several arguments

Time: 2012-10-21 23:32:15

Tags: regex bison flex-lexer

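Could you please explain how to write a regex that matches (arg1), (arg1, arg2), (arg1, arg2, xarg, zarg), and so on? Every name is an ASCII string that always starts with a character from [A-Za-z]. Here is what I've tried: "("[A-Za-z][a-z0-9]*(,)?([A-Za-z][a-z0-9]*)?")". Thanks!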

Note: the regex has to work in flex.

2 answers:

Answer 0: (score: 1)

Something like this?

>>> import re
>>> s = '''Could you explain, please, how can I make regex that will match (arg1), (arg1, arg2), (arg1, arg2, xarg, zarg), etc. Every name is an ASCII string which always starts with symbol [A-Za-z]. Here is what I've tried: "("[A-Za-z][a-z0-9]*(,)?([A-Za-z][a-z0-9]*)?")". Thanks!'''
>>> re.findall(r'\([A-Za-z]?arg[0-9]?(?:, [A-Za-z]?arg[0-9]?)*\)', s)
['(arg1)', '(arg1, arg2)', '(arg1, arg2, xarg, zarg)']
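
The pattern above only recognizes names built around the literal text `arg`. A minimal sketch of a more general version follows, assuming a name is an ASCII letter followed by ASCII letters or digits and that a comma may be followed by whitespace. The names `NAME` and `ARG_LIST` are illustrative and do not come from the answer.

```python
import re

# Assumption: a name is an ASCII letter followed by ASCII letters or digits.
NAME = r'[A-Za-z][A-Za-z0-9]*'

# One name, then zero or more ", name" groups, all inside parentheses.
# The groups are non-capturing so findall() returns the whole match.
ARG_LIST = re.compile(r'\(' + NAME + r'(?:,\s*' + NAME + r')*\)')

text = 'f(arg1) g(arg1, arg2) h(arg1, arg2, xarg, zarg) bad(1x) empty()'
found = ARG_LIST.findall(text)
print(found)
```

Here `(1x)` is skipped because a name may not start with a digit, and `()` is skipped because at least one name is required.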

Answer 1: (score: 1)

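I'm not sure flex is the right tool for this, since you would normally use it to split input like this into individual tokens. It is certainly possible, though: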

"("[[:alpha:]][[:alnum:]]*(,[[:alpha:]][[:alnum:]]*)*")"

That will match (arg1) and (arg1,arg2), but not ( arg1 ) or (arg1, arg2). If you want to ignore whitespace everywhere, it gets a lot more obscure.
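
To check that behavior outside flex, here is a rough Python equivalent of the strict pattern. Python's `re` module does not support POSIX bracket classes such as `[[:alpha:]]`, so plain ASCII ranges stand in for them; that substitution, and the name `STRICT`, are assumptions made for this sketch.

```python
import re

# ASCII ranges replace [[:alpha:]] and [[:alnum:]] (assumes the C locale).
STRICT = re.compile(r'\([A-Za-z][A-Za-z0-9]*(?:,[A-Za-z][A-Za-z0-9]*)*\)')

samples = ['(arg1)', '(arg1,arg2)', '( arg1 )', '(arg1, arg2)']

# fullmatch() requires the whole string to match, like a single flex token.
results = {s: STRICT.fullmatch(s) is not None for s in samples}
for sample, ok in results.items():
    print(f'{sample!r}: {ok}')
```

The first two samples match and the last two do not, because the pattern leaves no room for whitespace.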

This sort of thing is more readable if you use lex definitions:

ID      [[:alpha:]][[:alnum:]]*

%%

"("{ID}(","{ID})*")"

Or, with whitespace matching:

/* Make sure you're in the C locale when you compile. Or adjust
 * the definition accordingly. Perhaps you wanted to allow other 
 * characters in IDs.
 */
ID      [[:alpha:]][[:alnum:]]*
/* OWS = Optional White Space.*/
/* Flex defines blank as "space or tab" */
OWS     [[:blank:]]*
COMMA   {OWS}","{OWS}
OPEN    "("{OWS}
CLOSE   {OWS}")"

%%

{OPEN}{ID}({COMMA}{ID})*{CLOSE}  { /* Got a parenthesized list of ids */ }

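A final note: this also does not match (); there must be at least one id. If you want to include it, you can make the part between the parentheses optional: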

{OPEN}({ID}({COMMA}{ID})*)?{CLOSE}  { /* Got a parenthesized        */
                                      /* possibly empty list of ids */ }
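
The same idea of composing a pattern from named pieces can be sketched in Python, which makes it easy to try the whitespace-tolerant, possibly empty list before committing it to a scanner. The string constants mirror the lex definitions; `ID_LIST` and `parse_ids` are hypothetical names for this illustration, and ASCII ranges again stand in for the POSIX classes.

```python
import re

# Building blocks mirroring the lex definitions (C locale assumed).
ID = r'[A-Za-z][A-Za-z0-9]*'
OWS = r'[ \t]*'                 # optional white space: spaces or tabs
COMMA = OWS + ',' + OWS
OPEN = r'\(' + OWS
CLOSE = OWS + r'\)'

# Parenthesized, possibly empty, comma-separated list of ids.
ID_LIST = re.compile(OPEN + '(?:' + ID + '(?:' + COMMA + ID + ')*)?' + CLOSE)


def parse_ids(text):
    """Return the ids in text if it is a valid id list, otherwise None."""
    if ID_LIST.fullmatch(text) is None:
        return None
    # After a successful full match, only ids, commas, blanks and the two
    # parentheses remain, so findall() picks out exactly the ids.
    return re.findall(ID, text)


print(parse_ids('( arg1 , arg2 )'))
print(parse_ids('()'))
print(parse_ids('(arg1,)'))
```

Returning `None` for a malformed list and an empty list for `()` keeps the two cases distinguishable for the caller.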