Converting an mbox mailbox to CSV with Python

Time: 2015-11-05 05:43:07

Tags: python-2.7 parsing csv gmail

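I downloaded an archive of my mail from my Gmail account. I am using the following Python (2.7) code, taken from a blog, to convert the contents of the archive to CSV.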

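I would like to include the message body (the actual mail text) as well, but I can't figure out how. I haven't used Python before; can someone help? I have tried other options from SO but couldn't get them to work.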

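To do the same task I also tried the following code, but it raises an indentation error at line 60: return json_msg. I tried different indentation options, but nothing improved.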

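import mailbox
import csv

# Python 2.7: the csv module expects the output file opened in binary mode.
writer = csv.writer(open("clean_mail.csv", "wb"))
for message in mailbox.mbox('archive.mbox'):
    # One row per message: subject, sender, date (no body yet).
    writer.writerow([message['subject'], message['from'], message['date']])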

2 answers:

Answer 0 (score: 2)

Try this.

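import mailbox
import csv

# Python 2.7: the csv module expects the output file opened in binary mode.
writer = csv.writer(open("clean_mail.csv", "wb"))
for message in mailbox.mbox('archive.mbox'):
    if message.is_multipart():
        # A multipart payload is a list of sub-messages; join their payloads.
        content = ''.join(part.get_payload() for part in message.get_payload())
    else:
        # A single-part payload is already a string.
        content = message.get_payload()
    writer.writerow([message['subject'], message['from'], message['date'], content])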

Or this:

import mailbox
import csv

def get_message(message):
    if not message.is_multipart():
        return message.get_payload()
    contents = ""
    for msg in message.get_payload():
        contents = contents + str(msg.get_payload()) + '\n'
    return contents

if __name__ == "__main__":

    writer = csv.writer(open("clean_mail.csv", "wb"))
    for message in mailbox.mbox("archive.mbox"):
        contents = get_message(message)
        writer.writerow([message["subject"], message["from"], message["date"],contents])

See the documentation here.
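
One caveat with both snippets: `get_payload()` called without arguments returns the payload as stored, so base64 or quoted-printable bodies come out still encoded, and for nested multipart messages each part's payload is itself a list. Below is a minimal sketch of an alternative, assuming Python 3 and assuming that only the `text/plain` parts are wanted. The helper name `extract_text` and the `DEMO` message are made up for illustration and do not come from the question.

```python
import email
from email.message import Message


def extract_text(message: Message) -> str:
    """Return the decoded text/plain parts of a message, joined by newlines."""
    chunks = []
    # walk() yields the message itself and every nested part, depth-first.
    for part in message.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        if part.get_content_type() != "text/plain":
            continue  # skip HTML alternatives and attachments
        raw = part.get_payload(decode=True)  # bytes, transfer encoding undone
        if raw is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(raw.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset name in the header: fall back to UTF-8.
            chunks.append(raw.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


# Hypothetical demo message: a base64 text part plus an HTML alternative.
DEMO = (
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8gd29ybGQ=\n"
    "--XX\n"
    "Content-Type: text/html; charset=utf-8\n"
    "\n"
    "<p>hello world</p>\n"
    "--XX--\n"
)

demo_text = extract_text(email.message_from_string(DEMO))
```

In the loops above, `content = extract_text(message)` would take the place of the `get_payload()` logic, since the messages yielded by `mailbox.mbox` are `email.message.Message` subclasses.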

Answer 1 (score: 0)

A small improvement on Rahul's snippet for multipart content:

import sys
import mailbox
import csv
from email.header import decode_header

infile = sys.argv[1]
outfile = sys.argv[2]
writer = csv.writer(open(outfile, "w"))


def get_content(part):
    content = ''
    payload = part.get_payload()
    if isinstance(payload, str):
        content += payload
    else:
        for part in payload:
            content += get_content(part)
    return content


writer.writerow(['date', 'from', 'to', 'subject', 'content'])
for index, message in enumerate(mailbox.mbox(infile)):
    content = get_content(message)
    row = [
        message['date'],
        message['from'].strip('>').split('<')[-1],
        message['to'],
        decode_header(message['subject'])[0][0],
        content
    ]
    writer.writerow(row)
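
Two spots in this snippet look fragile: `message['from'].strip('>').split('<')[-1]` raises an error when a message has no From header, and `decode_header(...)[0][0]` keeps only the first fragment of the subject and can return bytes under Python 3. A sketch of a more defensive variant, assuming Python 3; `clean_header` and `sender_address` are hypothetical helper names, not part of the original answer:

```python
from email.header import decode_header, make_header
from email.utils import parseaddr


def clean_header(value):
    """Decode an RFC 2047 encoded header to str; a missing header becomes ''."""
    if value is None:
        return ""
    try:
        # make_header() reassembles all decoded fragments into one Header.
        return str(make_header(decode_header(value)))
    except (LookupError, UnicodeDecodeError):
        return str(value)  # unknown or broken charset: keep the raw text


def sender_address(value):
    """Return only the address part of a From header, or '' if it is absent."""
    return parseaddr(value or "")[1]


# Illustrative values only.
subject = clean_header("=?utf-8?b?aMOpbGxv?=")
address = sender_address("Jane Doe <jane@example.com>")
```

In the row above, `sender_address(message['from'])` and `clean_header(message['subject'])` would replace the two expressions.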