Hello, I'm having trouble with a regular expression at the following link:
https://regex101.com/r/wU4xK1/1
It matches almost all of the patterns, but I'm struggling when it encounters certain characters or a newline.
My regular expression is:
(\b(?:(jan|january|feb|february|mar|march|apr|april|may|jun|june|jul|july|aug|august|set|sep|september|oct|october|nov|november|dec|december)[/\.\s',’-]{0,4}\d{2,4}|(jan|january|feb|february|mar|march|apr|april|may|jun|june|jul|july|aug|august|set|sep|september|oct|october|nov|november|dec|december))[/\r-–––,]{0,4}[a-zA-Z]{3,8}[/\.\s',’-]{0,2}[\s]{0,4}\d{2,4})
My text is:
July 2005 – December - 2006
(Nov '12 - Feb 12)
(Nov 12 - Feb 12 )
july 2005 – Dec 2012 ## Note here. If i press enter after Dec 2012 I will get a match. Dont know why ?
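Answer 0 (score: 2)
Just turn the capturing groups into non-capturing groups, then enclose the whole pattern in a single capturing group. Note that in the pattern below the second month alternation is still a capturing group, which is why findall returns 2-tuples whose second element is empty.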
((?:\b(?:(?:jan|january|feb|february|mar|march|apr|april|may|jun|june|jul|july|aug|august|set|sep|september|oct|october|nov|november|dec|december)[/\.\s',’-]{0,4}\d{2,4}|(jan|january|feb|february|mar|march|apr|april|may|jun|june|jul|july|aug|august|set|sep|september|oct|october|nov|november|dec|december))[/\r-–––,]{0,4}[a-zA-Z]{3,8}[/\.\s',’-]{0,2}[\s]{0,4}\d{2,4}))
>>> import re
>>> s = '''July 2005 – December - 2006
(Nov '12 - Feb 12)
(Nov 12 - Feb 12 )
july 2005 – Dec 2012 ## Note here. If i press enter after Dec 2012 I will get a match. Dont know why ?'''
>>> re.findall(r"(?mi)((?:\b(?:(?:jan|january|feb|february|mar|march|apr|april|may|jun|june|jul|july|aug|august|set|sep|september|oct|october|nov|november|dec|december)[/\.\s',’-]{0,4}\d{2,4}|(jan|january|feb|february|mar|march|apr|april|may|jun|june|jul|july|aug|august|set|sep|september|oct|october|nov|november|dec|december))[/\r-–––,]{0,4}[a-zA-Z]{3,8}[/\.\s',’-]{0,2}[\s]{0,4}\d{2,4}))", s)
[('July 2005 – December - 2006', ''), ("Nov '12 - Feb 12", ''), ('Nov 12 - Feb 12', ''), ('july 2005 – Dec 2012', '')]
>>> m = re.findall(r"(?mi)((?:\b(?:(?:jan|january|feb|february|mar|march|apr|april|may|jun|june|jul|july|aug|august|set|sep|september|oct|october|nov|november|dec|december)[/\.\s',’-]{0,4}\d{2,4}|(jan|january|feb|february|mar|march|apr|april|may|jun|june|jul|july|aug|august|set|sep|september|oct|october|nov|november|dec|december))[/\r-–––,]{0,4}[a-zA-Z]{3,8}[/\.\s',’-]{0,2}[\s]{0,4}\d{2,4}))", s)
>>> [x for x, y in m]
['July 2005 – December - 2006', "Nov '12 - Feb 12", 'Nov 12 - Feb 12', 'july 2005 – Dec 2012']
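The tuples in the output above come from how `re.findall` treats groups: with more than one capturing group it returns a tuple of groups per match, and with no capturing groups it returns the whole matched string. A minimal sketch, using a simplified pattern that is only an illustration and not the full regex from the question:

```python
import re

text = "Nov 12 - Feb 12"

# Two capturing groups: findall returns a tuple of groups per match.
grouped = re.findall(r"(nov|feb) (\d{2})", text, re.IGNORECASE)

# Only a non-capturing group: findall returns the full matched strings.
ungrouped = re.findall(r"(?:nov|feb) \d{2}", text, re.IGNORECASE)

print(grouped)    # [('Nov', '12'), ('Feb', '12')]
print(ungrouped)  # ['Nov 12', 'Feb 12']
```

Making every inner group non-capturing would therefore remove the need for the list comprehension that unpacks the tuples.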
(?mi)
Here we combine the multiline and case-insensitive modifiers.
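As a rough illustration of what the `(?mi)` prefix does, the sketch below compares it with passing `re.MULTILINE | re.IGNORECASE` explicitly. The sample text and the short pattern are made up for this example:

```python
import re

text = "July 2005\njuly 2006\nJULY 2007"

# Inline flags at the very start of the pattern.
inline = re.findall(r"(?mi)^july \d{4}$", text)

# The same flags passed as an argument instead.
explicit = re.findall(r"^july \d{4}$", text, re.MULTILINE | re.IGNORECASE)

# Without either flag, ^ and $ only anchor at the ends of the string
# and "july" must match case exactly, so nothing is found.
no_flags = re.findall(r"^july \d{4}$", text)

print(inline)    # ['July 2005', 'july 2006', 'JULY 2007']
print(explicit)  # ['July 2005', 'july 2006', 'JULY 2007']
print(no_flags)  # []
```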
Answer 1 (score: 1)
Your regex does work, but you have to remove the empty line at the end of the regex. See https://regex101.com/r/wU4xK1/3
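A possible way to reproduce the effect in Python, assuming the stray empty line means the pattern ended with a literal line break: a newline inside a pattern must match a newline in the text, which would explain why pressing Enter after "Dec 2012" produced a match. The shortened pattern here is hypothetical and only stands in for the tail of the real one:

```python
import re

# Hypothetical, simplified pattern; the trailing "\n" stands in for the
# stray empty line at the end of the regex.
pattern_with_newline = "Dec \\d{4}\n"
pattern_clean = r"Dec \d{4}"

no_enter = "july 2005 – Dec 2012"
with_enter = "july 2005 – Dec 2012\n"

fails = re.search(pattern_with_newline, no_enter)       # no newline to consume
works = re.search(pattern_with_newline, with_enter)     # newline present
fixed = re.search(pattern_clean, no_enter)              # cleaned pattern

print(fails is None)      # True
print(works is not None)  # True
print(fixed.group())      # Dec 2012
```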
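Answer 2 (score: 1)
In addition to Aaron's correct comment (once that newline is removed, the matches show up), I'd also like to mention that \s matches any whitespace character in the class [\r\n\t\f ], so you can avoid capturing newlines by restricting the class to [ \t\f].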
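A small sketch of that difference, using a shortened pattern invented for the example rather than the full regex: `\s` lets a match run across a line break, while `[ \t\f]` keeps it on one line.

```python
import re

split_text = "Dec\n2012"   # month and year on different lines
same_line = "Dec 2012"

# \s also matches "\n", so the match spans both lines.
across_lines = re.search(r"Dec\s{0,4}\d{4}", split_text)

# [ \t\f] excludes line breaks, so the split text no longer matches...
restricted = re.search(r"Dec[ \t\f]{0,4}\d{4}", split_text)

# ...but text on a single line still does.
restricted_same_line = re.search(r"Dec[ \t\f]{0,4}\d{4}", same_line)

print(across_lines is not None)      # True
print(restricted)                    # None
print(restricted_same_line.group())  # Dec 2012
```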