我想修改一堆邮件列表。每个邮件列表都包含一个电子邮件地址列表(每行一个),我称之为“旧”地址。对于给定的电子邮件地址,旧的一个在.xlsx文件中与新的一起引用。如果未引用旧地址,则表示它已过时,必须将其删除。有时,邮件列表中的电子邮件地址已经很好了。在这种情况下,它必须保持不变。
我是在python中完成的。我真的没有问题,但我意识到这不是那么明显所以我想分享我的工作。首先是因为它看起来像我已经看过的一些帖子,它可能会有所帮助;第二,因为我的代码绝对没有优化(我不需要优化它,因为在我的情况下它需要0.5s这样)和我很想知道你会做什么来优化我的代码以防万一10 ^ 8个邮件列表。
答案 0 :(得分:0)
这是我最终实现的python代码:
import xlrd
import os
path_old = 'toto'
path_new = 'tata'
mailing_lists = os.listdir(path_old)
good_domain = 'gooddomain.fr'
printing_level = 3
# reading of the excel file
xlsfilename = 'adresses.xlsx'
xlsfile = xlrd.open_workbook(xlsfilename)
number_of_persons = 250
number_column_old_mail = 7
number_column_new_mail = 5
newmail = []
oldmail = []
for count in range(number_of_persons):
oldmail.append(xlsfile.sheets()[0].cell(count,number_column_old_mail).value)
newmail.append(xlsfile.sheets()[0].cell(count,number_column_new_mail).value)
############
for mailinglist_name in mailing_lists:
if printing_level > 0:
print('* dealing with mailing list ',mailinglist_name)
new_mailinglist = []
new_name = mailinglist_name + '_new'
with open(path_old+'/'+mailinglist_name,'r') as inputfile:
for line in inputfile:
if len(line)<2: # to ignore blank lines. This length of 2 is completly arbitrary
continue
line = line.rstrip('\n')
ok = False
# case 1: the address inside the old mailing list is ok ==> copied in the new mailing list
if '@' in line:
if line[line.index('@')+1:] == good_domain:
new_mailinglist.append(line)
if printing_level > 1:
print(' --> address ',line,' already ok ==> kept unmodified')
ok = True
# case 2: the address inside the old mailing list is not ok ==> must be treated
if not ok:
if printing_level > 1:
print(' --> old address ',line,' must be treated')
try:
# case 2a: the old address is in the excel file ==> replaced
ind = oldmail.index(line)
if printing_level > 2:
print(' --> old address found in the excel file and replaced by ',newmail[ind])
new_mailinglist.append(newmail[ind])
except ValueError:
# case 2b: the old address is obsolete ==> removed
if printing_level > 2:
print(' --> old address removed')
with open(path_new+'/'+new_name,'w') as outputfile:
for address in new_mailinglist:
outputfile.write(address+'\n')