Regular expression in Python doesn't seem to work with some hyphens

Date: 2015-02-12 08:59:51

Tags: python regex

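I'm trying to parse some dates with the regular expression below, but it only seems to work with some of the hyphens. Please see the link below: it fails to match some of the dates. I added the specific hyphen, but that only works on regex101.com, not in Python.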

((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)

https://regex101.com/r/vI6qN1/1

1 answer:

Answer 0 (score: 1)

You just need to add the multiline modifier, (?m):

>>> re.findall(r'(?m)((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)', s)
[('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('01/2005-02/2007', '01/2005', '02/2007'), ('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('03/1999-01/2002', '03/1999', '01/2002')]

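Whenever a pattern uses anchors (^ or $) on multi-line input, it's a good idea to add the multiline modifier. Without it, ^ matches only at the very start of the string, not at the start of each line.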
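As a minimal sketch of what the modifier changes, the snippet below uses a made-up two-line string and a simplified stand-in pattern (both are assumptions for illustration, not the pattern from the question). It only shows how ^ behaves with and without re.MULTILINE, which is the flag form of (?m).

```python
import re

# Hypothetical input: two date ranges, each at the start of its own line.
text = "01/2002-01/2003\n01/2005-02/2007"

# Simplified stand-in pattern: a MM/YYYY-MM/YYYY range anchored with ^.
pattern = r"^\d{2}/\d{4}-\d{2}/\d{4}"

# Without the flag, ^ matches only at the very start of the string.
without_flag = re.findall(pattern, text)
print(without_flag)   # ['01/2002-01/2003']

# With re.MULTILINE, ^ also matches right after every newline.
with_flag = re.findall(pattern, text, flags=re.MULTILINE)
print(with_flag)      # ['01/2002-01/2003', '01/2005-02/2007']

# The inline modifier (?m) at the start of the pattern is equivalent.
inline_flag = re.findall("(?m)" + pattern, text)
print(inline_flag)    # ['01/2002-01/2003', '01/2005-02/2007']
```

Passing flags=re.MULTILINE and prefixing the pattern with (?m) behave the same here; the inline form is handy when the pattern is stored as a plain string.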

Example:

>>> import re
>>> s = """     01/2013-11/2014
        01/2010-12/2012

        01/2009-01/2010


        03/2007-12/2009


 02/2003-01/2005

01/2002-01/2003


01/2005-02/2007  

 02/2003-01/2005, 

 01/2002-01/2003, 


, 03/1999-01/2002,
"""
>>> re.findall(r'((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)', s)
[('02/2003-01/2005', '02/2003', '01/2005'), ('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('03/1999-01/2002', '03/1999', '01/2002')]
>>> re.findall(r'(?m)((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)', s)
[('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('01/2005-02/2007', '01/2005', '02/2007'), ('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('03/1999-01/2002', '03/1999', '01/2002')]

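Note the difference between the two results: the dates that begin a line with no leading space (01/2002-01/2003 and 01/2005-02/2007) are matched only when (?m) is present.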