我已从我的Gmail帐户下载了邮件存档。我使用以下从博客中获取的python(2.7)代码将存档的内容转换为csv。
XX
X X
X X
X X
XX
XX
X X
X X
X X
X X
XX
我想要包含邮件正文(实际邮件)......但无法弄清楚如何。我之前没有使用过python,有人可以帮忙吗?我已经使用了其他SO选项,但无法通过。
要执行相同的任务,我也使用了以下代码:但是第60行会出现缩进错误:return json_msg。我尝试了不同的缩进选项,但没有任何改进。
import mailbox
import csv
writer = csv.writer(open(("clean_mail.csv", "wb"))
for message in mailbox.mbox('archive.mbox'):
writer.writerow([message['subject'], message['from'], message['date']])
答案 0 :(得分:2)
试试这个。
import mailbox
import csv
writer = csv.writer(open(("clean_mail.csv", "wb"))
for message in mailbox.mbox('archive.mbox'):
if message.is_multipart():
content = ''.join(part.get_payload() for part in message.get_payload())
else:
content = message.get_payload()
writer.writerow([message['subject'], message['from'], message['date'],content])
或者这个:
import mailbox
import csv
def get_message(message):
if not message.is_multipart():
return message.get_payload()
contents = ""
for msg in message.get_payload():
contents = contents + str(msg.get_payload()) + '\n'
return contents
if __name__ == "__main__":
writer = csv.writer(open("clean_mail.csv", "wb"))
for message in mailbox.mbox("archive.mbox"):
contents = get_message(message)
writer.writerow([message["subject"], message["from"], message["date"],contents])
查找文档here。
答案 1 :(得分:0)
Rahul片段对多部分内容的一点改进:
import sys
import mailbox
import csv
from email.header import decode_header
infile = sys.argv[1]
outfile = sys.argv[2]
writer = csv.writer(open(outfile, "w"))
def get_content(part):
content = ''
payload = part.get_payload()
if isinstance(payload, str):
content += payload
else:
for part in payload:
content += get_content(part)
return content
writer.writerow(['date', 'from', 'to', 'subject', 'content'])
for index, message in enumerate(mailbox.mbox(infile)):
content = get_content(message)
row = [
message['date'],
message['from'].strip('>').split('<')[-1],
message['to'],
decode_header(message['subject'])[0][0],
content
]
writer.writerow(row)