Question

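I want to extract phone numbers from text. When all the digits appear on one line, I can extract the number. However, if some of the digits continue on the next line, my regular expression does not work.

Here is my text: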

I will be out of the office. Please send me an email and text my mobile +45
20 32 40 08 if any urgency.

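In the text above, +45 is on the first line and 20 32 40 08 is on the second. I cannot extract the phone number when the text is laid out like this. When all the digits are on the same line, everything works fine.

Here is my regular expression: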

reg_phonestyle = re.compile(r'(\d{2}[-\/\.\ \s]??\d{2}[-\/\.\ \s]??\d{2}[-\/\.\ \s]??\d{2}[-\/\.\ \s]??\d{2}|\(\d{3}\)\s*\d{3}[-\/\.\ \s]??\d{4}|\d{3}[-\/\.\ \s]??\d{4})')

Answer 1

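You can pass an additional flag such as re.MULTILINE to the search, although that flag only changes how ^ and $ match; the pattern spans the line break because \s also matches a newline. Given your example, I propose the following solution: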

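import re

# The first line must end right after +45: the pattern allows at most
# one separator character between two groups of digits.
input_str = '''

I will be out of the office. Please send me an email and text my mobile +45
20 32 40 08 if any urgency.

'''
# Raw string so that \s reaches the regex engine unchanged.
phone_reg = re.compile(r"([0-9]{2,4}[-.\s]{,1}){5}", re.MULTILINE)

print(phone_reg.search(input_str).group(0))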

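This regex looks for five consecutive groups, each made of 2 to 4 digits followed by at most one separator (a hyphen, a dot, or a whitespace character, which includes the newline).

Hope this helps.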
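
Because the pattern above allows at most one separator between groups, it stops matching as soon as the first line ends with trailing spaces or a Windows line ending (\r\n). A minimal sketch of a more tolerant variant follows. The sample strings and the helper name extract_phone are made up for illustration, and the pattern assumes the number is written as five two-digit groups, as in the question.

```python
import re

# Hypothetical samples: the same message with different line endings.
samples = [
    "text my mobile +45\n20 32 40 08 if any urgency.",     # plain newline
    "text my mobile +45\r\n20 32 40 08 if any urgency.",   # Windows line ending
    "text my mobile +45   \n20 32 40 08 if any urgency.",  # trailing spaces
]

# An optional '+', then five two-digit groups. Between groups, allow either
# one of '-', '.', '/' or any run of whitespace (possibly empty).
# No flag is needed: \s matches '\n' and '\r' by default.
phone_re = re.compile(r"\+?\d{2}(?:(?:[-./]|\s*)\d{2}){4}")


def extract_phone(text):
    """Return the first phone-like match with whitespace collapsed, or None."""
    match = phone_re.search(text)
    if match is None:
        return None
    return " ".join(match.group(0).split())


for sample in samples:
    print(extract_phone(sample))
```

Returning None instead of calling .group(0) directly avoids an AttributeError when the text contains no number.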

Answer 2

This is how I extract the phone number. I would actually like more examples to validate my regex.

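import re

strs = '''
I will be out of the office. Please send me an email and text my mobile +45
20 32 40 08 if any urgency.
'''
# After "mobile ", match a run of digits (each optionally preceded by one
# other character, such as '+') and whitespace. re.S lets '.' match '\n'.
phone = re.compile(r"(?<=mobile\s)(.?[0-9]|\s)+", re.S)

# Collapse newlines and repeated spaces into single spaces.
print(" ".join(phone.search(strs).group(0).split()))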
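
As a starting point for the extra examples asked for above, here is a small sketch that runs this pattern against a few inputs. The inputs and the helper name find_phone are invented for illustration; the last two cases show limits of the approach rather than desirable behavior.

```python
import re

# The pattern from this answer, written as a raw string.
phone = re.compile(r"(?<=mobile\s)(.?[0-9]|\s)+", re.S)


def find_phone(text):
    """Return the match with whitespace collapsed, or None if nothing matches."""
    match = phone.search(text)
    return " ".join(match.group(0).split()) if match else None


# Hypothetical inputs paired with the result the pattern gives for each.
cases = [
    # Number split across two lines, as in the question.
    ("text my mobile +45\n20 32 40 08 if any urgency.", "+45 20 32 40 08"),
    # Hyphen-separated digits on one line.
    ("call my mobile 20-32-40-08 today", "20-32-40-08"),
    # No "mobile " keyword before the number, so nothing is found.
    ("phone: +45 20 32 40 08", None),
    # '.?' accepts any character before a digit, so this is a false positive.
    ("my mobile x1 y2", "x1 y2"),
]

for text, expected in cases:
    result = find_phone(text)
    status = "ok" if result == expected else "MISMATCH"
    print(repr(text), "->", repr(result), status)
```

The pattern depends on the word "mobile" appearing directly before the number, so text that introduces the number differently needs a different anchor.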

Extract phone numbers from text in Python