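I am new to Python and would like to use it with regex to process a list of more than 5k email addresses. I need to wrap each address in double quotes. I am using \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b to identify each email address. How can I replace a current entry such as user@email.com with "user@email.com", adding quotes around each of the 5k email addresses?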
Answer 0: (score: 2)
You can use the re.sub function with a backreference to the captured group, like this:
>>> import re
>>> a = "this is email: someone@mail.com and this one is another email foo@bar.com"
>>> re.sub(r'([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})', r'"\1"', a)
'this is email: "someone@mail.com" and this one is another email "foo@bar.com"'
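The pattern in the question uses uppercase-only character classes with \b anchors, so on its own it would not match lowercase addresses. A minimal sketch of how it could be kept as written, assuming case-insensitive matching is what was intended; the helper name quote_emails is hypothetical:

```python
import re

# Pattern taken from the question. The [A-Z] classes only match lowercase
# input when re.IGNORECASE is set (assumed to be the intent).
EMAIL_RE = re.compile(r'\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b', re.IGNORECASE)

def quote_emails(text):
    """Wrap every matched address in double quotes (hypothetical helper)."""
    # \g<0> refers to the whole match, so no capturing group is needed.
    return EMAIL_RE.sub(r'"\g<0>"', text)

print(quote_emails("this is email: someone@mail.com and this one is another email foo@bar.com"))
```

Compiling the pattern once avoids repeating the long regex at each call site, which may be convenient when the same substitution is applied to thousands of entries.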
Update: if you have a file and want to replace the emails on each of its lines, you can use readlines() like this:
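import re

# Read all lines of the input file into memory.
with open("email.txt", "r") as file:
    lines = file.readlines()

# Wrap every email address found on each line in double quotes.
new_lines = []
for line in lines:
    new_lines.append(re.sub(r'([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})', r'"\1"', line))

# Write the modified lines to a new file.
with open("email-new.txt", "w") as file:
    file.writelines(new_lines)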
email.txt:
this is test@something.com and another email here foo@bar.com
another email abc@bcd.com
still remaining someone@something.com
email-new.txt (after running the code):
this is "test@something.com" and another email here "foo@bar.com"
another email "abc@bcd.com"
still remaining "someone@something.com"
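For a longer list, the same substitution could also be applied one line at a time instead of collecting everything with readlines() first. This is only a sketch of that variation, not part of the original answer; the function name quote_emails_in_stream is hypothetical, and it accepts any pair of text file objects:

```python
import re

# Same pattern as in the answer, compiled once and reused for every line.
EMAIL_RE = re.compile(r'([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})')

def quote_emails_in_stream(src, dst):
    """Copy src to dst line by line, quoting each email address (hypothetical helper)."""
    for line in src:
        dst.write(EMAIL_RE.sub(r'"\1"', line))
```

With the file names used above, it could be called as quote_emails_in_stream(src, dst) inside a block such as with open("email.txt") as src, open("email-new.txt", "w") as dst:.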