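Could you please explain how to write a regex that matches (arg1), (arg1, arg2), (arg1, arg2, xarg, zarg), and so on? Every name is an ASCII string that always starts with a character in [A-Za-z]. Here is what I've tried:
"("[A-Za-z][a-z0-9]*(,)?([A-Za-z][a-z0-9]*)?")"
Thanks!
Note: the regex must work in flex.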
Answer 0 (score: 1)
Something like this?
>>> import re
>>> s = '''Could you explain, please, how can I make regex that will match (arg1), (arg1, arg2), (arg1, arg2, xarg, zarg), etc. Every name is an ASCII string which always starts with symbol [A-Za-z]. Here is what I've tried: "("[A-Za-z][a-z0-9]*(,)?([A-Za-z][a-z0-9]*)?")". Thanks!'''
>>> re.findall(r'\([A-Za-z]?arg[0-9]?(?:, [A-Za-z]?arg[0-9]?)*\)', s)
['(arg1)', '(arg1, arg2)', '(arg1, arg2, xarg, zarg)']
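Note that the pattern above is tied to the literal text "arg". A more general sketch is shown here, assuming a name is one ASCII letter followed by ASCII letters or digits and that items are separated by a comma plus optional blanks. The names NAME and LIST_RE and the sample text are illustrative only, not part of the question.

```python
import re

# Assumption: a name is one ASCII letter followed by ASCII letters or digits.
NAME = r'[A-Za-z][A-Za-z0-9]*'

# A parenthesized list: one name, then zero or more ",<blanks>name" items.
LIST_RE = re.compile(r'\(' + NAME + r'(?:,[ \t]*' + NAME + r')*\)')

# Hypothetical sample input; "(1x)" and "()" should be rejected.
text = 'f(arg1), g(arg1, arg2), h(a, b2, xarg, zarg), bad(1x), empty()'
print(LIST_RE.findall(text))
# → ['(arg1)', '(arg1, arg2)', '(a, b2, xarg, zarg)']
```

Like the original, this uses Python's re module, so it only illustrates the shape of the pattern; flex needs its own syntax.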
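Answer 1 (score: 1)
I'm not sure flex is the right tool, since you would normally use it to split input like this into individual tokens. It is certainly possible, though: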
"("[[:alpha:]][[:alnum:]]*(,[[:alpha:]][[:alnum:]]*)*")"
This matches (arg1) and (arg1,arg2), but not ( arg1 ) or (arg1, arg2). If you want to ignore whitespace everywhere, it gets a bit more obscure.
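To check the whitespace behaviour quickly outside flex, here is a rough Python equivalent of the strict pattern. It assumes the C locale, where [[:alpha:]] is [A-Za-z] and [[:alnum:]] is [A-Za-z0-9]; the name STRICT is illustrative.

```python
import re

# Rough Python rendering of the flex pattern, assuming the C locale:
# [[:alpha:]] -> [A-Za-z], [[:alnum:]] -> [A-Za-z0-9].
STRICT = re.compile(r'\([A-Za-z][A-Za-z0-9]*(?:,[A-Za-z][A-Za-z0-9]*)*\)')

for sample in ['(arg1)', '(arg1,arg2)', '( arg1 )', '(arg1, arg2)']:
    # fullmatch succeeds only if the entire string matches the pattern.
    print(repr(sample), bool(STRICT.fullmatch(sample)))
# → '(arg1)' True
# → '(arg1,arg2)' True
# → '( arg1 )' False
# → '(arg1, arg2)' False
```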
This sort of thing is more readable if you use lex definitions:
ID [[:alpha:]][[:alnum:]]*
%%
"("{ID}(","{ID})*")"
Or, to match whitespace as well:
/* Make sure you're in the C locale when you compile. Or adjust
* the definition accordingly. Perhaps you wanted to allow other
* characters in IDs.
*/
ID [[:alpha:]][[:alnum:]]*
/* OWS = Optional White Space.*/
/* Flex defines blank as "space or tab" */
OWS [[:blank:]]*
COMMA {OWS}","{OWS}
OPEN "("{OWS}
CLOSE {OWS}")"
%%
{OPEN}{ID}({COMMA}{ID})*{CLOSE} { /* Got a parenthesized list of ids */ }
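One final note: this also does not match (); there must be at least one id. If you want to include it, you can make the part between the parentheses optional: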
{OPEN}({ID}({COMMA}{ID})*)?{CLOSE} { /* Got a parenthesized */
                                     /* possibly empty list of ids */ }
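The same build-from-named-pieces idea can be sketched in Python for experimentation. This is only an approximation of the flex definitions: it assumes the C locale and treats [[:blank:]] as space or tab. The names NONEMPTY and MAYBE_EMPTY are illustrative.

```python
import re

# Named pieces mirroring the lex definitions (C locale assumed).
ID = r'[A-Za-z][A-Za-z0-9]*'
OWS = r'[ \t]*'              # optional white space: spaces or tabs
COMMA = OWS + ',' + OWS
OPEN = r'\(' + OWS
CLOSE = OWS + r'\)'

# At least one id is required.
NONEMPTY = re.compile(OPEN + ID + '(?:' + COMMA + ID + ')*' + CLOSE)
# The whole id list is optional, so "()" is accepted as well.
MAYBE_EMPTY = re.compile(OPEN + '(?:' + ID + '(?:' + COMMA + ID + ')*)?' + CLOSE)

for sample in ['( arg1 , arg2 )', '()', '(arg1,)']:
    print(repr(sample),
          bool(NONEMPTY.fullmatch(sample)),
          bool(MAYBE_EMPTY.fullmatch(sample)))
# → '( arg1 , arg2 )' True True
# → '()' False True
# → '(arg1,)' False False
```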