Logical OR in re.search(), or two patterns in one re.search()

Time: 2015-04-09 21:12:50

Tags: python regex findall

I have the following strings.

Page load for http://xxxx?roxy=www.yahoo.com&sendto=https://mywebsite?ent took 4001 ms (Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)<br><br><br>Topic: Yahoo!! My website is a good website | Mywebsite<br>

OR

Page load for http://xxxx?roxy=www.yahoo.com&sendto=https://mywebsite?ent took too long (12343 ms Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)<br><br><br>Topic: Yahoo!! My website is a good website | Mywebsite<br>

I want to extract 4001 or 12343 from the text above, that is, from "ent took 4001 ms" or "ent took too long (12343 ms", and assign it to a variable.

tt = int(re.search(r"\?ent\s*took\s*(\d+)",message).group(1))

This regex does match the first form and returns 4001. How can I logically OR it with the expression r"\?ent\s*took\s*too\s*long\s*\((\d+)" so that 12343 is extracted from the second form?

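3 answers:

Answer 0: (score: 3)

A question mark at the start of a regex does not follow anything that it could make optional. If you want to match a literal question mark there, write \?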

x = int(re.findall(r"\?ent\s*took\s*([^m]*)",message)[0])
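A minimal sketch of both points, assuming a shortened copy of the first sample string as `message`: an unescaped leading `?` fails to compile, and `[^m]*` captures everything up to the first `m`, trailing space included, which `int()` tolerates.

```python
import re

# Shortened copy of the first sample string from the question (an assumption
# made to keep the example readable).
message = ("Page load for http://xxxx?roxy=www.yahoo.com"
           "&sendto=https://mywebsite?ent took 4001 ms (Ne: 167 ms)")

# An unescaped leading "?" has nothing to repeat, so compilation fails.
try:
    re.compile(r"?ent\s*took")
    compiled = True
except re.error:
    compiled = False

# [^m]* captures everything up to the first "m", including the trailing space.
captured = re.findall(r"\?ent\s*took\s*([^m]*)", message)[0]
x = int(captured)  # int() ignores surrounding whitespace

print(compiled, repr(captured), x)
```

On the second sample string the same capture would be `too long (12343 `, which `int()` rejects, so this pattern covers only the first form.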

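Answer 1: (score: 1)

First, you need to escape the ? at the start of your pattern, because ? is a regex metacharacter that makes the preceding item optional, so it must follow something. To match a literal ?, use \?. More simply, you can use re.search with \d+ in the pattern, which avoids the extra indexing: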

>>> int(re.search(r"\?ent\s*took\s*(\d+)",s).group(1))
4001

For the second example, you can do this:

>>> re.search(r'\((\d+)',s).group(1)
'12343'

To match both cases, use the following pattern:

(\d+)[\s\w]+\(|\((\d+)

Demo

>>> s1="Page load for http://xxxx?roxy=www.yahoo.com&sendto=https://mywebsite?ent took too long (12343 ms Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)<br><br><br>Topic: Yahoo!! My website is a good website | Mywebsite<br>"
>>> s2="Page load for http://xxxx?roxy=www.yahoo.com&sendto=https://mywebsite?ent took 4001 ms (Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)<br><br><br>Topic: Yahoo!! My website is a good website | Mywebsite<br>"
>>> re.search(r'(\d+)[\s\w]+\(|\((\d+)',s1).group(2)
'12343'
>>> re.search(r'(\d+)[\s\w]+\(|\((\d+)',s2).group(1)
'4001'
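Because only one of the two groups participates in any given match, the caller has to know which group to read. One way around that, sketched here with a hypothetical helper `extract_ms` and shortened sample strings, is to take whichever group is not `None`:

```python
import re

# Same alternation as above; group 1 or group 2 is set, never both.
PATTERN = re.compile(r"(\d+)[\s\w]+\(|\((\d+)")

def extract_ms(text):
    """Return the first number matched by either alternative, or None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    # The group that did not participate in the match is None.
    return int(m.group(1) or m.group(2))

# Shortened copies of the two sample strings (an assumption for brevity).
s1 = "https://mywebsite?ent took too long (12343 ms Ne: 167 ms, Se: 2509 ms)"
s2 = "https://mywebsite?ent took 4001 ms (Ne: 167 ms, Se: 2509 ms)"

print(extract_ms(s1), extract_ms(s2))
```

Note that this pattern is not anchored to "?ent took", so a URL that itself contains digits followed by word characters and a parenthesis could match first.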

Answer 2: (score: 1)

This matches both patterns at once and extracts the desired number:

tt = int(re.search(r"\?ent took (too long \()?(?P<num>\d+)",message).group('num'))
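A variation on the same idea, offered as a sketch rather than a drop-in replacement: the optional prefix becomes a non-capturing group, whitespace is matched with `\s+`, and a message that matches neither form yields `None` instead of raising `AttributeError`. The names `TOOK_RE` and `load_time_ms` are hypothetical.

```python
import re

# Optional "too long (" prefix in a non-capturing group, then the number.
TOOK_RE = re.compile(r"\?ent\s+took\s+(?:too\s+long\s+\()?(?P<num>\d+)")

def load_time_ms(message):
    """Return the load time in ms from either message form, or None."""
    m = TOOK_RE.search(message)
    return int(m.group("num")) if m else None

# Shortened copies of the two sample strings (an assumption for brevity).
fast = "https://mywebsite?ent took 4001 ms (Ne: 167 ms, Se: 2509 ms)"
slow = "https://mywebsite?ent took too long (12343 ms Ne: 167 ms, Se: 2509 ms)"

print(load_time_ms(fast), load_time_ms(slow), load_time_ms("unrelated line"))
```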