Using a regular expression to find and replace email addresses

Time: 2019-03-26 20:06:24

Tags: python

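I'm new to Python and hoping to use regex to process a list of more than 5k email addresses. I need to wrap each address in double quotes. I am using \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b to identify each email address. How can I replace the current entry user@email.com with "user@email.com", so that each of the 5k email addresses ends up surrounded by quotes?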

1 Answer:

Answer 0: (score: 2)

You can use re.sub (a function in the re module) with a backreference, as follows:

>>> import re
>>> a = "this is email: someone@mail.com and this one is another email foo@bar.com"
>>> re.sub(r'([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})', r'"\1"', a)

'this is email: "someone@mail.com" and this one is another email "foo@bar.com"'
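
Note that the pattern in the question uses uppercase-only character classes ([A-Z0-9...]), so on its own it would not match a lowercase address such as user@email.com. A minimal sketch, assuming case-insensitive matching is what was intended, keeps that pattern and adds re.IGNORECASE. In the replacement, \g<0> stands for the whole match, so no capture group is needed. The names EMAIL_RE and quote_emails are illustrative and do not come from the original post.

```python
import re

# The question's pattern, compiled case-insensitively
# (assumption: lowercase addresses should match as well).
EMAIL_RE = re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b", re.IGNORECASE)


def quote_emails(text):
    """Wrap every address matched by EMAIL_RE in double quotes."""
    # \g<0> refers to the entire match.
    return EMAIL_RE.sub(r'"\g<0>"', text)


print(quote_emails("contact user@email.com or Foo.Bar@Example.org"))
# prints: contact "user@email.com" or "Foo.Bar@Example.org"
```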

Update: if you have a file and want to replace the email addresses on each of its lines, you can use readlines() like this:

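import re

# Compile the pattern once; the capture group is referenced as \1 in the replacement.
pattern = re.compile(r'([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})')

# Read every line of the input file.
with open("email.txt", "r") as file:
    lines = file.readlines()

# Wrap each email address found on a line in double quotes.
new_lines = []
for line in lines:
    new_lines.append(pattern.sub(r'"\1"', line))

# Write the modified lines to a new file.
with open("email-new.txt", "w") as file:
    file.writelines(new_lines)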

email.txt:

this is test@something.com and another email here foo@bar.com
another email abc@bcd.com
still remaining someone@something.com

email-new.txt (after running the code):

this is "test@something.com" and another email here "foo@bar.com"
another email "abc@bcd.com"
still remaining "someone@something.com"
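
For a larger file, the same substitution could also be applied one line at a time, without first loading every line with readlines(). The following is only a sketch under that assumption; quote_emails_in_file and its parameter names are hypothetical, and the pattern is the one from the answer.

```python
import re

# Same pattern as in the answer, compiled once and reused for every line.
EMAIL_RE = re.compile(r"([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def quote_emails_in_file(src_path, dst_path):
    """Copy src_path to dst_path, wrapping each email address in double quotes."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:  # iterate lazily instead of calling readlines()
            dst.write(EMAIL_RE.sub(r'"\1"', line))
```

Calling quote_emails_in_file("email.txt", "email-new.txt") should produce the same email-new.txt as shown above.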