Question

I'm trying to do a basic check on the email IDs in a CSV file. I don't know why the "if" check never passes.

import csv
import re
input_file = open("test_list.csv", "r").readlines()
print(len(input_file))
csv_reader = csv.reader(input_file)
line_count = 0
try:
    for row in csv_reader:
        line_count += 1
        print('Checking ' + str(line_count) + ' of ' + str(len(input_file)))
        name = {row[0]}
        email = list({row[2]})
        print(str(email[0]))
        print('Checking contact name'+str(name))
        regex = '^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$'
        match = re.match(regex,str(email[0]))
        if match == None :
            print("Bad Email")
        else:
            print("Good Email") 
        print('')
        print('')
except IndexError as error:
    print('Checked all the data')

My CSV file looks like this:

bhanu1, singh2, bha.nu@gmail.com
bhanu2, singh2, bhadoxit.com
bhanu3, singh2, bhan@esnotexit.com

My output is:

3
Checking 1 of 3
 bha.nu@gmail.com
Checking contact nameset(['bhanu1'])
Bad Email

Checking 2 of 3
 bhadoxit.com
Checking contact nameset(['bhanu2'])
Bad Email

Checking 3 of 3
 bhan@esnotexit.com
Checking contact nameset(['bhanu3'])
Bad Email

Answer 1

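All of your email addresses start with a space, because you are not trimming the whitespace that follows each comma. The regex is anchored with ^, so a leading space makes every match fail.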
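To see the effect in isolation, here is a small sketch that applies the question's pattern to one field as csv.reader returns it, with and without the leading space. The names EMAIL_RE and raw_field are only illustrative.

```python
import re

# Same pattern as in the question, written as a raw string.
EMAIL_RE = re.compile(
    r'^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$')

# What csv.reader yields for the third column of "bhanu1, singh2, bha.nu@gmail.com".
raw_field = ' bha.nu@gmail.com'

leading_space_matches = EMAIL_RE.match(raw_field) is not None
stripped_matches = EMAIL_RE.match(raw_field.strip()) is not None

print(leading_space_matches)  # False
print(stripped_matches)       # True
```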

Also, your code handles the data in several odd and roundabout ways. Here is a refactoring with inline comments.

import csv
import re

input_file = open("test_list.csv", "r").readlines()
print(len(input_file))

csv_reader = csv.reader(input_file)
# Compile regex once, use multiple times inside loop
regex = re.compile(
    r'^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$')
try:
    for line_count, row in enumerate(csv_reader, 1):
        print('Checking {0} of {1}'.format(line_count, len(input_file)))
        # Don't make a set out of this
        name = row[0]
        # Don't make a list out of this; trim spaces
        email = row[2].strip()
        print(email)
        print('Checking contact name {}'.format(name))
        match = regex.match(email)
        if match is None:
            print("Bad Email")
        else:
            print("Good Email") 
        print('')
except IndexError as error:
    print('Checked all the data')

The try/except handling is still odd, and reading the whole file into memory and then parsing it as CSV is rather clumsy.
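One possible way to address both remarks is sketched below, assuming the same three-column layout as in the question. The helper names check_emails and check_file are hypothetical, not part of the original answer. The file is streamed through csv.reader inside a with block, and short rows are skipped explicitly instead of being caught as IndexError.

```python
import csv
import re

EMAIL_RE = re.compile(
    r'^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$')


def check_emails(lines):
    """Yield (name, email, is_valid) for each CSV row with at least 3 columns.

    `lines` is any iterable of text lines, for example an open file object.
    """
    for row in csv.reader(lines):
        if len(row) < 3:
            # Skip blank or short rows instead of relying on IndexError.
            continue
        name = row[0].strip()
        email = row[2].strip()
        yield name, email, EMAIL_RE.match(email) is not None


def check_file(path):
    # newline='' is what the csv module documentation recommends for files.
    with open(path, newline='') as handle:
        for name, email, is_valid in check_emails(handle):
            print('{}: {} -> {}'.format(
                name, email, 'Good Email' if is_valid else 'Bad Email'))
```

Calling check_file("test_list.csv") would then print one line per contact. This version drops the "of N" progress count, since the total is not known without reading the whole file first.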

Answer 2

In your output, you can see a space in front of each email address. If you remove it, the check should work. Just add strip() before matching:

match = re.match(regex,str(email[0]).strip())
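As an alternative to calling strip() on each field, the csv module can drop the space after each delimiter while parsing. The sketch below uses the skipinitialspace option of csv.reader on two of the sample rows, held in an in-memory io.StringIO so that it runs without the file.

```python
import csv
import io

sample = io.StringIO(
    'bhanu1, singh2, bha.nu@gmail.com\n'
    'bhanu2, singh2, bhadoxit.com\n'
)

# skipinitialspace=True drops whitespace that directly follows each comma.
rows = list(csv.reader(sample, skipinitialspace=True))
emails = [row[2] for row in rows]
print(emails)  # ['bha.nu@gmail.com', 'bhadoxit.com']
```

This option only handles whitespace right after the delimiter. Trailing spaces are kept, so strip() remains the more thorough choice if the data may contain them.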

Regex value doesn't match email IDs in CSV
