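I have strings like "00:00:00 Segment 1 00:20:00 Segment 2 8:00:00 Segment 3" and "00:00 Segment 1 20:0 Segment 2", and I want to use re.split() and re.findall() to extract all the timestamps and segment names. However, I am struggling to make a group optional without it also capturing. This is what I have so far: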
str_1 = "00:00:00 Segment 1 00:20:00 Segment 2 8:00:00 Segment 3"
str_2 = "00:00 Segment 1 20:0 Segment 2"
re.findall(r'\d\d?:\d\d?:\d\d?', str_1)
=> ['00:00:00', '00:20:00', '8:00:00']
re.split(r'\d\d?:\d\d?:\d\d?', str_1)
=> ['', ' Segment 1 ', ' Segment 2 ', ' Segment 3']
The above works fine, but it cannot handle str_2. If I make the third pair of digits an optional group, only that optional group is returned:
re.findall(r'\d\d?:\d\d?(:\d\d?)?', str_1)
=> [':00', ':00', ':00']
re.split(r'\d\d?:\d\d?(:\d\d?)?', str_1)
=> ['', ':00', ' Segment 1 ', ':00', ' Segment 2 ', ':00', ' Segment 3']
re.findall(r'\d\d?:\d\d?(:\d\d?)?', str_2)
=> ['', '']
re.split(r'\d\d?:\d\d?(:\d\d?)?', str_2)
=> ['', None, ' Segment 1 ', None, ' Segment 2']
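However, if I make the optional group non-capturing, str_2 works, but the results for str_1 get mixed up (the seconds are cut off the timestamps and left behind in the split pieces):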
re.findall(r'\d\d?:\d\d?(?:\d\d?)?', str_1)
=> ['00:00', '00:20', '8:00']
re.split(r'\d\d?:\d\d?(?:\d\d?)?', str_1)
=> ['', ':00 Segment 1 ', ':00 Segment 2 ', ':00 Segment 3']
re.findall(r'\d\d?:\d\d?(?:\d\d?)?', str_2)
=> ['00:00', '20:0']
re.split(r'\d\d?:\d\d?(?:\d\d?)?', str_2)
=> ['', ' Segment 1 ', ' Segment 2']
I want a single regular expression that works on both str_1 and str_2: one with an optional group that does not capture. Is there any way to achieve this?
Answer 0 (score: 0)
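It looks like your pattern is missing a colon; you need two, one for the ?: that makes the group non-capturing and one for your literal :, like so: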
re.findall(r'\d\d?:\d\d?(?::\d\d?)?', str_1)
=> ['00:00:00', '00:20:00', '8:00:00']
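For completeness, here is a small sketch that runs the same pattern over both strings and pairs each timestamp with the segment name that follows it. The names TIMESTAMP and parse_segments are made up for this illustration and are not part of the re module; the sketch also assumes every string starts with a timestamp, as both examples in the question do.

```python
import re

# Pattern from the answer: the optional seconds part is a non-capturing
# group, (?: ... )?, whose content starts with a literal colon.
TIMESTAMP = re.compile(r'\d\d?:\d\d?(?::\d\d?)?')

str_1 = "00:00:00 Segment 1 00:20:00 Segment 2 8:00:00 Segment 3"
str_2 = "00:00 Segment 1 20:0 Segment 2"


def parse_segments(text):
    """Hypothetical helper: return (timestamp, segment name) pairs."""
    stamps = TIMESTAMP.findall(text)
    # split() returns the text before the first timestamp as its first
    # element (an empty string here), so drop it before pairing.
    names = [name.strip() for name in TIMESTAMP.split(text)[1:]]
    return list(zip(stamps, names))


print(parse_segments(str_1))
# [('00:00:00', 'Segment 1'), ('00:20:00', 'Segment 2'), ('8:00:00', 'Segment 3')]
print(parse_segments(str_2))
# [('00:00', 'Segment 1'), ('20:0', 'Segment 2')]
```

Because the group is non-capturing, findall() returns whole matches and split() does not interleave group contents (or None) with the pieces, which is the behavior the question asks for.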