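I am trying to parse some dates with the regular expression below, but it only seems to work with some hyphens. Please see the link below. It fails to match some of the dates. I added the specific hyphens, but that only works on regex101.com, not in Python.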
((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)
Answer 0 (score: 1)
You only need to add the multiline modifier.
>>> re.findall(r'(?m)((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)', s)
[('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('01/2005-02/2007', '01/2005', '02/2007'), ('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('03/1999-01/2002', '03/1999', '01/2002')]
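Whenever a pattern uses anchors such as ^ and has to match on every line of the input, it is best to add the multiline modifier. Without it, ^ matches only at the very start of the string.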
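To make the effect of the modifier easier to see in isolation, here is a minimal sketch. The pattern and the sample text are simplified stand-ins chosen for illustration (they are assumptions, not the pattern from the question); only the behaviour of ^ with and without re.MULTILINE is the point.

```python
import re

# Hypothetical sample: three MM/YYYY-MM/YYYY ranges, one per line.
text = "01/2002-01/2003\n01/2005-02/2007\n03/1999-01/2002"

# Simplified, illustrative pattern: a range anchored to the start of a line.
pattern = r'^\d{2}/\d{4}-\d{2}/\d{4}'

# Without the flag, ^ matches only at the very start of the string.
without_flag = re.findall(pattern, text)

# With re.MULTILINE (equivalent to a leading (?m)), ^ also matches after each newline.
with_flag = re.findall(pattern, text, flags=re.MULTILINE)

print(without_flag)  # ['01/2002-01/2003']
print(with_flag)     # ['01/2002-01/2003', '01/2005-02/2007', '03/1999-01/2002']
```

Passing flags=re.MULTILINE and writing (?m) at the start of the pattern should behave the same way; the inline form is handy when the pattern string is all you can change.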
Example:
>>> s = """ 01/2013-11/2014
01/2010-12/2012
01/2009-01/2010
03/2007-12/2009
02/2003-01/2005
01/2002-01/2003
01/2005-02/2007
02/2003-01/2005,
01/2002-01/2003,
, 03/1999-01/2002,
"""
>>> re.findall(r'((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)', s)
[('02/2003-01/2005', '02/2003', '01/2005'), ('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('03/1999-01/2002', '03/1999', '01/2002')]
>>> re.findall(r'(?m)((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)', s)
[('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('01/2005-02/2007', '01/2005', '02/2007'), ('02/2003-01/2005', '02/2003', '01/2005'), ('01/2002-01/2003', '01/2002', '01/2003'), ('03/1999-01/2002', '03/1999', '01/2002')]
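Note the difference between the two results: the second call, with (?m), returns matches that the first call misses.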
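If you want to check which ranges depend on the modifier, one option is to run the same pattern in both modes and compare. The sketch below uses the pattern from the question unchanged; the two-line sample string is a hypothetical input made up for this illustration, and the variable names are my own.

```python
import re

# The pattern from the question, unchanged.
PATTERN = r'((?:(?<= )|^)((?:0?[1-9]|1[0-2])?[/-] ?(?:[12][0-9])?[0-9]{2})\b(?:[\s-]+)[/-]{0,2}\s*\b((?:0?[1-9]|1[0-2]) ?[/-] ?(?:[12][0-9])?[0-9]{2})\b)'

default_re = re.compile(PATTERN)
multiline_re = re.compile(PATTERN, re.MULTILINE)  # same effect as a leading (?m)

# Hypothetical sample: the first range follows a space, the second starts its line.
sample = " 02/2003-01/2005\n01/2002-01/2003\n"

# Group 1 is the whole range, groups 2 and 3 are the two dates.
default_ranges = [m.group(1) for m in default_re.finditer(sample)]
multiline_ranges = [m.group(1) for m in multiline_re.finditer(sample)]

# Ranges found only when ^ is allowed to match at the start of every line.
only_multiline = [r for r in multiline_ranges if r not in default_ranges]

print(default_ranges)    # ['02/2003-01/2005']
print(multiline_ranges)  # ['02/2003-01/2005', '01/2002-01/2003']
print(only_multiline)    # ['01/2002-01/2003']
```

The range after the space is found in both modes through the (?<= ) lookbehind, while the range at the start of the second line is only reachable through ^ in multiline mode.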