Reformatting some mailing lists according to an Excel file

Time: 2017-08-30 09:14:17

Tags: python excel email

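I want to modify a bunch of mailing lists. Each mailing list contains a list of email addresses (one per line), which I call the "old" addresses. For a given email address, the old one is referenced in an .xlsx file together with the new one. If an old address is not referenced, it is obsolete and must be removed. Sometimes an email address in a mailing list is already correct; in that case it must be left unchanged.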
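The three rules above can be sketched as a small function. This is only an illustration: `remap_address`, `old_to_new`, and the sample addresses are hypothetical names, and `gooddomain.fr` stands in for whatever domain counts as already correct.

```python
def remap_address(address, old_to_new, good_domain):
    """Return the address to write to the new list, or None to drop it."""
    # Rule 1: an address already on the good domain is kept unchanged.
    if address.partition('@')[2] == good_domain:
        return address
    # Rule 2: an old address referenced in the spreadsheet is replaced.
    # Rule 3: an unreferenced old address is obsolete, so .get() returns None.
    return old_to_new.get(address)


# Hypothetical sample data standing in for the spreadsheet contents.
old_to_new = {'alice@old.example': 'alice@gooddomain.fr'}

print(remap_address('bob@gooddomain.fr', old_to_new, 'gooddomain.fr'))
print(remap_address('alice@old.example', old_to_new, 'gooddomain.fr'))
print(remap_address('carol@old.example', old_to_new, 'gooddomain.fr'))
```

Returning `None` for obsolete addresses lets the caller decide whether to skip them silently or log them.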

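I did it in Python. I don't really have a problem, but I realized it was not that obvious, so I wanted to share my work. First, because it resembles some posts I have already seen, so it might help; second, because my code is absolutely not optimized (I did not need to optimize it, since in my case it takes about 0.5 s), and I am curious to know what you would do to optimize it in case there were 10^8 mailing lists.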

1 answer:

Answer 0 (score: 0)

Here is the Python code I ended up with:

import xlrd
import os
path_old = 'toto'
path_new = 'tata'
mailing_lists = os.listdir(path_old)
good_domain = 'gooddomain.fr'
printing_level = 3

# Read the old and new addresses from the first sheet of the Excel file.
# Note: xlrd 2.0 and later only read .xls; an .xlsx file needs xlrd < 2.0.
xlsfilename = 'adresses.xlsx'
xlsfile = xlrd.open_workbook(xlsfilename)
sheet = xlsfile.sheet_by_index(0)  # look the sheet up once, not on every row
number_of_persons = 250
number_column_old_mail = 7
number_column_new_mail = 5
newmail = []
oldmail = []
for count in range(number_of_persons):
    oldmail.append(sheet.cell(count, number_column_old_mail).value)
    newmail.append(sheet.cell(count, number_column_new_mail).value)
############

for mailinglist_name in mailing_lists:
    if printing_level > 0:
        print('* dealing with mailing list ',mailinglist_name)
    new_mailinglist = []
    new_name = mailinglist_name + '_new'

    with open(os.path.join(path_old, mailinglist_name), 'r') as inputfile:
        for line in inputfile:
            # strip the newline and any surrounding whitespace (also handles '\r\n')
            line = line.strip()
            if not line:  # ignore blank lines
                continue
            ok = False

            # case 1: the address inside the old mailing list is ok ==> copied in the new mailing list
            if '@' in line:
                if line[line.index('@')+1:] == good_domain:
                    new_mailinglist.append(line)
                    if printing_level > 1:
                        print(' --> address ',line,' already ok ==> kept unmodified')
                    ok = True

            # case 2: the address inside the old mailing list is not ok ==> must be treated
            if not ok:
                if printing_level > 1:
                    print(' --> old address ',line,' must be treated')
                try:
                    # case 2a: the old address is in the excel file ==> replaced
                    ind = oldmail.index(line)
                    if printing_level > 2:
                        print('  --> old address found in the excel file and replaced by ',newmail[ind])
                    new_mailinglist.append(newmail[ind])
                except ValueError:
                    # case 2b: the old address is obsolete ==> removed
                    if printing_level > 2:
                        print('  --> old address removed')

    with open(path_new+'/'+new_name,'w') as outputfile:
        for address in new_mailinglist:
            outputfile.write(address+'\n')
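
On the optimization question: `oldmail.index(line)` scans the whole list for every address, so the cost per address grows with the number of rows in the Excel file. One possible improvement, sketched here under the assumption that each old address appears at most once in the spreadsheet, is to build a dictionary once and look addresses up in constant time on average. The function name `convert_list` and the sample data are hypothetical; in the script above, `oldmail` and `newmail` would come from the Excel file.

```python
import io


def convert_list(infile, outfile, old_to_new, good_domain):
    """Stream addresses from infile to outfile, one per line."""
    for line in infile:
        address = line.strip()
        if not address:
            continue  # skip blank lines
        if address.partition('@')[2] == good_domain:
            outfile.write(address + '\n')  # already ok: keep as is
        elif address in old_to_new:
            outfile.write(old_to_new[address] + '\n')  # replace
        # otherwise the address is obsolete and is dropped


# Hypothetical stand-ins for the columns read from the Excel file.
oldmail = ['alice@old.example', 'dave@old.example']
newmail = ['alice@gooddomain.fr', 'dave@gooddomain.fr']
old_to_new = dict(zip(oldmail, newmail))  # built once, reused for every list

# In-memory files are used here only to keep the sketch self-contained.
source = io.StringIO('bob@gooddomain.fr\n\nalice@old.example\ncarol@old.example\n')
result = io.StringIO()
convert_list(source, result, old_to_new, 'gooddomain.fr')
print(result.getvalue())
```

Writing each address as soon as it is processed also avoids holding a whole mailing list in memory. If the spreadsheet did contain duplicate old addresses, `dict(zip(...))` would keep the last match whereas `list.index` returns the first, so that case would need an explicit decision.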