I am trying to do a basic check on the email IDs in a CSV file. I don't know why the "if" check is not going through.
import csv
import re
input_file = open("test_list.csv", "r").readlines()
print(len(input_file))
csv_reader = csv.reader(input_file)
line_count = 0
try:
    for row in csv_reader:
        line_count += 1
        print('Checking ' + str(line_count) + ' of ' + str(len(input_file)))
        name = {row[0]}
        email = list({row[2]})
        print(str(email[0]))
        print('Checking contact name'+str(name))
        regex = '^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$'
        match = re.match(regex,str(email[0]))
        if match == None :
            print("Bad Email")
        else:
            print("Good Email")
        print('')
        print('')
except IndexError as error:
    print('Checked all the data')
My CSV file looks like this:
bhanu1, singh2, bha.nu@gmail.com
bhanu2, singh2, bhadoxit.com
bhanu3, singh2, bhan@esnotexit.com
My output is:
3
Checking 1 of 3
bha.nu@gmail.com
Checking contact nameset(['bhanu1'])
Bad Email
Checking 2 of 3
bhadoxit.com
Checking contact nameset(['bhanu2'])
Bad Email
Checking 3 of 3
bhan@esnotexit.com
Checking contact nameset(['bhanu3'])
Bad Email
Answer 0: (score: 1)
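All of your email addresses start with a space, because you are not trimming the whitespace that follows each comma.
In addition, your code handles the data in several strange and roundabout ways. Here is a refactoring with inline comments.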
import csv
import re
input_file = open("test_list.csv", "r").readlines()
print(len(input_file))
csv_reader = csv.reader(input_file)
# Compile regex once, use multiple times inside loop
regex = re.compile(
    r'^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$')
try:
    for line_count, row in enumerate(csv_reader, 1):
        print('Checking {0} of {1}'.format(line_count, len(input_file)))
        # Don't make a set out of this
        name = row[0]
        # Don't make a list out of this; trim spaces
        email = row[2].strip()
        print(email)
        print('Checking contact name {}'.format(name))
        match = regex.match(email)
        if match is None:
            print("Bad Email")
        else:
            print("Good Email")
        print('')
except IndexError as error:
    print('Checked all the data')
The try/except handling is still odd, and reading the whole file into memory and then parsing it as CSV is rather clumsy.
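One possible way to address both remaining points is sketched below, assuming Python 3. It passes the file object straight to csv.reader, uses the reader's skipinitialspace option to drop the space after each comma, and skips short rows with a length check instead of catching IndexError. The function name check_emails and the in-memory sample are illustrative only and are not part of the original code.

```python
import csv
import io
import re

# Same pattern as above, compiled once.
EMAIL_RE = re.compile(
    r'^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$')


def check_emails(csv_file):
    """Return a list of (name, email, is_valid) tuples, one per usable row.

    check_emails is a hypothetical helper name used for this sketch.
    """
    results = []
    # skipinitialspace=True ignores whitespace right after each delimiter.
    for row in csv.reader(csv_file, skipinitialspace=True):
        if len(row) < 3:
            # Blank or short row: skip it rather than rely on IndexError.
            continue
        name = row[0]
        email = row[2].strip()  # also removes any trailing whitespace
        results.append((name, email, EMAIL_RE.match(email) is not None))
    return results


# In-memory stand-in for test_list.csv so the sketch runs on its own.
sample = io.StringIO(
    "bhanu1, singh2, bha.nu@gmail.com\n"
    "bhanu2, singh2, bhadoxit.com\n"
    "bhanu3, singh2, bhan@esnotexit.com\n")

for name, email, ok in check_emails(sample):
    print(name, email, "Good Email" if ok else "Bad Email")
```

With a real file, the same function could be called as check_emails(f) inside a with open("test_list.csv", newline="") as f: block, so the file is streamed row by row and closed automatically.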
Answer 1: (score: 0)
In your output, you can see a space in front of each email. If you remove it, the check should work fine. Just add strip() to your code before matching:
match = re.match(regex,str(email[0]).strip())
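To see the effect in isolation, here is a minimal sketch that applies the question's pattern to one field with and without the leading space. The helper name is_valid and the hard-coded value are made up for this demonstration.

```python
import re

regex = r'^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$'

# The third field as csv.reader returns it by default: leading space included.
raw = " bha.nu@gmail.com"


def is_valid(value):
    """Hypothetical helper: True if the value matches the pattern."""
    return re.match(regex, value) is not None


print(is_valid(raw))          # False: the pattern is anchored at '^' and does not allow a space
print(is_valid(raw.strip()))  # True once the surrounding whitespace is removed
```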