Python: split/find with a regex but keep the delimiter

Time: 2021-01-04 03:15:51

Tags: python regex

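I have a string 21-12-20 2pm - 10pm 22-12-20 10am - 6pm 24-12-20 1pm - 10pm 28-12-20 8:05pm - 8:47pm 29-12-20 12pm - 4pm, and I want to split it [edit: I am using findall to do this] into a list of dates.

I am using this regex (\d{1,2}-\d{1,2}-\d{1,2}.*?(?=\d{1,2}-\d{1,2}-\d{1,2})) to find the matches, but I can't get it to match the last one.

Am I on the right track, or should I approach this differently?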

2 answers:

Answer 0: (score: 2)

Use re.findall here rather than a string split:

import re

inp = "21-12-20 2pm - 10pm  22-12-20 10am - 6pm  24-12-20 1pm - 10pm  28-12-20 8:05pm - 8:47pm  29-12-20 12pm - 4pm"
# Match a date followed by a "start - end" time range; the minutes part (":05") is optional
dates = re.findall(r'\d{1,2}-\d{1,2}-\d{1,2} \d{1,2}(?::\d{1,2})*(?:am|pm) - \d{1,2}(?::\d{1,2})*(?:am|pm)', inp)
print(dates)

This prints:

['21-12-20 2pm - 10pm', '22-12-20 10am - 6pm', '24-12-20 1pm - 10pm',
 '28-12-20 8:05pm - 8:47pm', '29-12-20 12pm - 4pm']

If you really only want to extract the dates, put a capture group around the date part of the pattern:

import re

inp = "21-12-20 2pm - 10pm  22-12-20 10am - 6pm  24-12-20 1pm - 10pm  28-12-20 8:05pm - 8:47pm  29-12-20 12pm - 4pm"
# With one capture group, re.findall returns only the captured date for each match
dates = re.findall(r'(\d{1,2}-\d{1,2}-\d{1,2}) \d{1,2}(?::\d{1,2})*(?:am|pm) - \d{1,2}(?::\d{1,2})*(?:am|pm)', inp)
print(dates)

This prints:

['21-12-20', '22-12-20', '24-12-20', '28-12-20', '29-12-20']
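
The lookahead from the question can probably be kept as well. It misses the last entry because the lookahead requires another date to follow, and nothing follows the final entry. A minimal sketch, assuming it is acceptable to let the lookahead also succeed at the end of the string (the names date, pattern and entries are only illustrative):

```python
import re

inp = "21-12-20 2pm - 10pm  22-12-20 10am - 6pm  24-12-20 1pm - 10pm  28-12-20 8:05pm - 8:47pm  29-12-20 12pm - 4pm"

# Illustrative helper: the date sub-pattern used in the question
date = r"\d{1,2}-\d{1,2}-\d{1,2}"

# Lazily consume text after a date until either the next date
# (optionally preceded by whitespace) or the end of the string follows.
pattern = rf"{date}.*?(?=\s*{date}|$)"

entries = re.findall(pattern, inp)
print(entries)
```

The `\s*` inside the lookahead keeps the separating spaces out of each match, so this does not depend on how many spaces sit between entries.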

Answer 1: (score: 1)

Since the dates are separated by 2 spaces, you can also use re.split:

import re

st = "21-12-20 2pm - 10pm  22-12-20 10am - 6pm  24-12-20 1pm - 10pm  28-12-20 8:05pm - 8:47pm  29-12-20 12pm - 4pm"

# Split on every run of exactly two whitespace characters
print(re.split(r'\s\s', st))

['21-12-20 2pm - 10pm', '22-12-20 10am - 6pm', '24-12-20 1pm - 10pm', '28-12-20 8:05pm - 8:47pm', '29-12-20 12pm - 4pm']
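
Splitting on two spaces only works if the separator really is two spaces; the string as written in the question shows single spaces. A hedged variant, assuming every entry starts with a date in the same d-m-y shape, is to split on whitespace that is immediately followed by a date. The lookahead consumes nothing, so the date stays attached to its entry (the variable name parts is only illustrative):

```python
import re

st = "21-12-20 2pm - 10pm 22-12-20 10am - 6pm 24-12-20 1pm - 10pm 28-12-20 8:05pm - 8:47pm 29-12-20 12pm - 4pm"

# Split on whitespace only where a date follows; the lookahead is
# zero-width, so the date itself is kept at the start of the next piece.
parts = re.split(r"\s+(?=\d{1,2}-\d{1,2}-\d{1,2})", st)
print(parts)
```

Because the pattern uses `\s+`, the same call should also handle the double-spaced version of the string.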