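Timestamps can take several forms, for example 01:01:01, 1:10:11, 323:14, 01:03 and 9:13.
I am currently using re.findall() with the | alternation operator.
Is there a more efficient way to find all of these timestamp forms in a string than the following?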
import re

aString = "the cat (01:03) sat on [01:01:01] the ( 9:13 )mat( 1:10:11)."
bString = "the cat 01:14:23.447 sat on the mat"
cString = "the cat 01:14:23.447 --> 01:17:10.239 sat on the mat"
dString = "the cat 323:14 sat on the mat"
# Raw strings keep \d from being treated as a string escape sequence.
v = re.findall(r'\d{2}:\d{2}:\d{2}|\d:\d{2}:\d{2}|\d{3}:\d{2}|\d{2}:\d{2}|\d:\d{2}', aString)
x = re.findall(r'\d{2}:\d{2}:\d{2}|\d:\d{2}:\d{2}|\d{3}:\d{2}|\d{2}:\d{2}|\d:\d{2}', bString)
y = re.findall(r'\d{2}:\d{2}:\d{2}|\d:\d{2}:\d{2}|\d{3}:\d{2}|\d{2}:\d{2}|\d:\d{2}', cString)
z = re.findall(r'\d{2}:\d{2}:\d{2}|\d:\d{2}:\d{2}|\d{3}:\d{2}|\d{2}:\d{2}|\d:\d{2}', dString)
v
['01:03', '01:01:01', '9:13', '1:10:11']
x
['01:14:23']
y
['01:14:23', '01:17:10']
z
['323:14']
Note: I don't care about the milliseconds if a timestamp includes them.
Answer 0 (score: 1)
You can use:
import re

aString = "the cat (01:03) sat on [01:01:01] the ( 9:13 )mat( 1:10:11)."
bString = "the cat 01:14:23.447 sat on the mat"
cString = "the cat 01:14:23.447 --> 01:17:10.239 sat on the mat"
dString = "the cat 323:14 sat on the mat"
# One to three digits, followed by one or two ":dd" groups.
v = re.findall(r'\d{1,3}(?::\d{2}){1,2}', aString)
x = re.findall(r'\d{1,3}(?::\d{2}){1,2}', bString)
y = re.findall(r'\d{1,3}(?::\d{2}){1,2}', cString)
z = re.findall(r'\d{1,3}(?::\d{2}){1,2}', dString)
print(v, x, y, z, sep='\n')
Output:
['01:03', '01:01:01', '9:13', '1:10:11']
['01:14:23']
['01:14:23', '01:17:10']
['323:14']
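Since the same pattern is applied to every string, it can be compiled once and reused. The following is only a minimal sketch of that idea; the names TIMESTAMP_RE, samples and results are illustrative and do not come from the answer itself.

```python
import re

# Compile once so the same pattern object is reused for every input string.
# TIMESTAMP_RE, samples and results are hypothetical names for this sketch.
TIMESTAMP_RE = re.compile(r'\d{1,3}(?::\d{2}){1,2}')

samples = {
    "aString": "the cat (01:03) sat on [01:01:01] the ( 9:13 )mat( 1:10:11).",
    "bString": "the cat 01:14:23.447 sat on the mat",
    "cString": "the cat 01:14:23.447 --> 01:17:10.239 sat on the mat",
    "dString": "the cat 323:14 sat on the mat",
}

# The group is non-capturing, so findall() returns the full matched text.
results = {name: TIMESTAMP_RE.findall(text) for name, text in samples.items()}

for name, found in results.items():
    print(name, found)
```

Compiling mainly saves repeating the pattern text; whether it makes a measurable speed difference depends on how many strings are scanned.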
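Explanation:
\d{1,3}
matches at least 1 and at most 3 digits
(?:
starts a non-capturing group
:\d{2}
matches a colon followed by 2 digits
)
ends the group
{1,2}
matches the preceding group at least once and at most twice
Answer 1 (score: 1)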
If you only need to match timestamps and not validate them at the same time, the following regex is sufficient:
\d+:\d+(?::\d+)?
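Breakdown:
\d+:\d+
matches digits, a colon, then digits
(?:
start of a non-capturing group
:\d+
matches a colon followed by digits
)?
end of the non-capturing group, which is optional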
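To illustrate what "matching but not validating" means in practice, the sketch below runs this looser pattern next to the bounded pattern from the other answer. The names LOOSE_RE and STRICT_RE and the sample text (including the made-up "ratio 12345:6" fragment) are assumptions added for illustration only.

```python
import re

# Loose pattern from this answer: any run of digits around the colons.
LOOSE_RE = re.compile(r'\d+:\d+(?::\d+)?')
# Bounded pattern from the other answer, shown here only for comparison.
STRICT_RE = re.compile(r'\d{1,3}(?::\d{2}){1,2}')

# Hypothetical input: two real timestamps plus a colon-separated ratio.
text = "the cat 01:14:23.447 --> 01:17:10.239 sat, ratio 12345:6"

loose = LOOSE_RE.findall(text)
strict = STRICT_RE.findall(text)

print(loose)   # ['01:14:23', '01:17:10', '12345:6']
print(strict)  # ['01:14:23', '01:17:10']
```

The loose pattern is shorter and accepts any digit counts, so it also picks up digit:digit text that is not a timestamp; the bounded pattern rejects "12345:6" because every colon must be followed by exactly two digits.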