Saving printed output to a .txt file

Time: 2016-12-02 10:16:19

Tags: python regex output

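I have a script that extracts all email addresses from a .txt document and prints them. I want to save them to list.txt instead, and remove duplicates if possible, but it gives this error: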

Traceback (most recent call last):
  File "mail.py", line 44, in <module>
    notepad.write(email.read())
AttributeError: 'str' object has no attribute 'read'

Script:

from optparse import OptionParser
import os.path
import re

regex = re.compile(("([a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`"
                    "{|}~-]+)*(@|\sat\s)(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(\.|"
                    "\sdot\s))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"))

def file_to_str(filename):
    """Returns the contents of filename as a string."""
    with open(filename) as f:
        return f.read().lower() # Case is lowered to prevent regex mismatches.

def get_emails(s):
    """Returns an iterator of matched emails found in string s."""
    # Removing lines that start with '//' because the regular expression
    # mistakenly matches patterns like 'http://foo@bar.com' as '//foo@bar.com'.
    return (email[0] for email in re.findall(regex, s) if not email[0].startswith('//'))

if __name__ == '__main__':
    parser = OptionParser(usage="Usage: python %prog [FILE]...")
    # No options added yet. Add them here if you ever need them.
    options, args = parser.parse_args()

    if not args:
        parser.print_usage()
        exit(1)

    for arg in args:
        if os.path.isfile(arg):
            for email in get_emails(file_to_str(arg)):
                #print email
                notepad = open("list.txt","wb")
                notepad.write(email.read())
                notepad.close()

        else:
            print '"{}" is not a file.'.format(arg)
            parser.print_usage()

1 answer:

Answer 0 (score: 0)

  

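"When I remove .read(), list.txt shows only 1 email address, while print email shows a few hundred. When I refresh list.txt while the extraction is running, the email address changes, but it only ever shows 1."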

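Removing .read() is correct: email is already a string, so it can be written directly. Only one address shows up because open() and close() are inside the loop, i.e. the file is reopened and overwritten for every email, so in the end it contains only the last address. Change the loop to: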

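            # Open the file once, before the loop, so it is not truncated for every address.
            notepad = open("list.txt", "w")
            for email in get_emails(file_to_str(arg)):
                #print email
                notepad.write(email + "\n")  # one address per line
            notepad.close()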

Or even better:

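            # The with block closes the file automatically, even if an error occurs.
            with open("list.txt", "w") as notepad:
                for email in get_emails(file_to_str(arg)):
                    #print email
                    notepad.write(email + "\n")  # one address per line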
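
The question also asks about removing duplicates, which the loop above does not handle. Here is a minimal sketch of one way to do it, not something taken from the original script: the helper names unique_in_order and save_emails are made up for illustration, and the code is written for Python 3, while the script in the question uses Python 2 syntax. It keeps a set of addresses already seen, writes each new address on its own line, and opens the file only once.

```python
def unique_in_order(items):
    """Yield each item once, keeping the order in which items first appear."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


def save_emails(emails, path="list.txt"):
    """Write one unique address per line to path; return how many were written.

    Hypothetical helper for illustration; `emails` is any iterable of strings,
    such as the generator returned by get_emails() in the question.
    """
    count = 0
    # Open once, outside the loop, so earlier addresses are not overwritten.
    with open(path, "w") as notepad:
        for email in unique_in_order(emails):
            notepad.write(email + "\n")
            count += 1
    return count
```

For a single input file, this could be called as save_emails(get_emails(file_to_str(arg))). With several input files, the open() call would still overwrite list.txt for each file, so the addresses from all files would need to be collected first and passed in together.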