I'm using a regex to find all email addresses in a string, but sometimes an address gets cut off and only part of it is returned.
import re
regex=r'(\w{1,}((\.|_|-|\w)[\w]){0,}@\w{1,}((\.|_|-|\w)[\w]){0,}\.\w{1,})'
str2fetch='''
wwwr@h.com.h.ki.l =》1 #==》wwwr@h.com.h.ki
sdfsd
2@mail2.4.sdu.edu.cn.u.163.com #=>2@mail2.4.sdu.edu.cn
0@0.0
1@1.1.1
1@123434.22222.333.4444.com
AAAAAA2@p.2-t.2.3o.2.abcd4 #=>aaaaaa2@p.2-t.2.3o
AAAAAA2@p.2t.2.3o.2.abcd4 #=>aaaaaa2@p.2t
AAAAAA2@p.2-t.2p.3o.2.abcd4 #=>aaaaaa2@p.2-t.2p
DAAAAAA2@p.2p-t.2.3o.2.abcd4 #=>daaaaaa2@p.2p
3@3.3.3.3.3
4@4.4.4.4.4.4
'''
emailList=list(set(re.findall(regex,str2fetch.lower())))
print(emailList)
In each line below, the left side is the result I expect, but the right side is what is actually returned.
wwwr@h.com.h.ki.l =》1 #==》wwwr@h.com.h.ki
AAAAAA2@p.2-t.2.3o.2.abcd4 #=>aaaaaa2@p.2-t.2.3o
AAAAAA2@p.2t.2.3o.2.abcd4 #=>aaaaaa2@p.2t
AAAAAA2@p.2-t.2p.3o.2.abcd4 #=>aaaaaa2@p.2-t.2p
DAAAAAA2@p.2p-t.2.3o.2.abcd4 #=>daaaaaa2@p.2p
Answer 0 (score: 0)
regex = r'(\w{1,}((\.|_|-|\w)[\w]){0,}@\w{1,}((\.|_|-|\w)[\w]){0,}\.\w{1,})'
# The pattern breaks down into six consecutive parts:
a = r'\w{1,}'                 # one or more word characters (letters, digits, underscore)
b = r'((\.|_|-|\w)[\w]){0,}'  # zero or more two-character pairs: a dot, underscore, hyphen, or word character, then exactly one word character
c = r'@'                      # the literal @ sign
d = r'\w{1,}'                 # one or more word characters
e = r'((\.|_|-|\w)[\w]){0,}'  # zero or more two-character pairs, same as b
f = r'\.\w{1,}'               # a dot followed by one or more word characters
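The breakdown above points at the likely cause of the truncation: parts b and e consume exactly two characters per repetition, so a label such as `.ki` (a dot plus two letters) cannot be consumed by whole pairs, and part f then closes the match at that label. A minimal sketch of one possible fix follows, assuming that each repetition should instead take a separator plus a whole label. The names `FIXED` and `find_emails` are hypothetical and not from the original post, and the pattern is only a loose matcher for this test data, not a validator for real-world email addresses.

```python
import re

# The pattern from the question: each repeated group consumes exactly two characters.
ORIGINAL = r'(\w{1,}((\.|_|-|\w)[\w]){0,}@\w{1,}((\.|_|-|\w)[\w]){0,}\.\w{1,})'

# Hypothetical replacement (an assumption, not the original author's pattern):
# each repetition consumes one separator plus a whole label, and the groups
# are non-capturing.
FIXED = r'\w+(?:[._-]\w+)*@\w+(?:[.-]\w+)*\.\w+'


def find_emails(text, pattern=FIXED):
    """Return the unique full matches of `pattern` in the lowercased text, sorted."""
    # finditer + group(0) yields the whole match even when the pattern has
    # capturing groups (re.findall would return tuples for ORIGINAL).
    return sorted({m.group(0) for m in re.finditer(pattern, text.lower())})


sample = 'wwwr@h.com.h.ki.l AAAAAA2@p.2t.2.3o.2.abcd4 0@0.0'

print(find_emails(sample, ORIGINAL))
# ['0@0.0', 'aaaaaa2@p.2t', 'wwwr@h.com.h.ki']
print(find_emails(sample, FIXED))
# ['0@0.0', 'aaaaaa2@p.2t.2.3o.2.abcd4', 'wwwr@h.com.h.ki.l']
```

With the original pattern the match stops at the first label that the two-character pairs cannot cover, while the sketch keeps consuming labels until the address ends. The non-capturing groups also mean `re.findall` would return plain strings for `FIXED` rather than tuples of groups.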