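My goal is to use the Google API to pull data out of emails that I specify. So far I can find the message, fetch its data, and decode that data into a readable form. After that, I need to find the right part of the message (the one of type text/html) and then scan it for my link with Beautiful Soup. Unfortunately, I don't understand the structure of emails or of the Google API well enough to scan just that part of the message.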
try:
    message = gmail_service.users().messages().get(userId='me', id=thread['id'], format='raw').execute()
    print 'Message snippet: %s' % message['snippet']
    msg_str = base64.urlsafe_b64decode(message['raw'].encode('ASCII'))
    mime_msg = email.message_from_string(msg_str)
    print mime_msg #this line gives the output I quoted
    for parts in mime_msg['payload']: #this line produces error quoted
        if parts['text/html']:
            mylink = base64.urlsafe_b64decode(part[0]['body']['data'].encode('UTF-8'))
            print mylink
The error this code gives me is:
Traceback (most recent call last):
  File "gmailAPI.py", line 55, in <module>
    for parts in mime_msg['payload']:
TypeError: 'NoneType' object is not iterable
The output of my code also includes information about the different parts of the message. This is the part I want:
----boundary_1_81681de2-2c9a-4827-802a-91544e5e6e28
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: base64
PCFET0NUWVBFIGh0bWwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMDEgVHJhbnNpdGlvbmFsLy9FTiINCiAgICJodHRwOi8vd3d3LnczLm9yZy9UUi9odG1sNC9sb29zZS5kdGQiPg0KDQo8aHRtbCBsYW5nPSJlbiI+DQo8aGVhZD4NCgk8bWV0YSBodHRwLWVxdWl2PSJDb250ZW50LVR5cGUiIGNvbnRlbnQ9InRleHQvaHRtbDsgY2hhcnNldD11dGYtOCI+DQoJPHRpdGxlPlNpZ251cDwvdGl0bGU+DQo8L2hlYWQ+DQoNCjxib2R5IGJnY29sb3I9IiNmZmZmZmYiIHRvcG1hcmdpbj0iMCIgbGVmdG1hcmdpbj0iMCIgbWFyZ2luaGVpZ2h0PSIwIiBtYXJnaW53aWR0aD0iMCIgc3R5bGU9Ii13ZWJraXQtZm9udC1zbW9vdGhpbmc6IGFudGlhbGlhc2VkO3dpZHRoOjEwMCUgIWltcG9ydGFudDtiYWNrZ3JvdW5kOiNmZmZmZmY7LXdlYmtpdC10ZXh0LXNpemUtYWRqdXN0Om5vbmU7Ij4NCg0KPHRhYmxlIHdpZHRoPSIxMDAlIiBjZWxscGFkZGluZz0iMCIgY2VsbHNwYWNpbmc9IjAiIGJvcmRlcj0iMCIgYmdjb2xvcj0iI2ZmZmZmZiI+DQoJPHRyPg0KCQk8dGQgYmdjb2xvcj0iI2ZmZmZmZiIgd2lkdGg9IjEwMCUiPg0KCQkJPHRhYmxlIHdpZHRoPSI2MDAiIGNlbGxwYWRkaW5nPSIwIiBjZWxsc3BhY2luZz0iMCIgYm9yZGVyPSIwIiBhbGlnbj0iY2VudGVyIiBjbGFzcz0i
dGFibGUiPg0KCQkJCTx0cj4NCgkJCQkJPHRkIHdpZHRoPSI2MDAiIGNsYXNzPSJjZWxsIj4NCgkgICAJCQkJCTx0YWJsZSB3aWR0aD0iNjAwIiBjZWxscGFkZGluZz0iMCIgY2VsbHNwYWNpbmc9IjAiIGNsYXNzPSJtYXN0Ij4NCgkJCQkJCQk8dHI+DQoJCQkJCQkJCTx0ZCB3aWR0aD0iMjUwIiBiZ2NvbG9yPSIjZmZmZmZmIj4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIDxpbWcgc3JjPSJjaWQ6QlRfTG9nby5qcGciIGFsdD0iQm90dG9tbGluZSBsb2dvIiBzdHlsZT0iLW1zLWludGVycG9sYXRpb24tbW9kZTpiaWN1YmljOyI+PGJyLz48YnIgLz4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgIAk8L3RyPg0KICAgICAgICAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgPHRkIGFsaWduPSJsZWZ0IiB3aWR0aD0iMzUwIiBzdHlsZT0icGFkZGluZy1ib3R0b206IDE1cHg7IiB2YWxpZ249InRvcCIgYmdjb2xvcj0iI2ZmZmZmZiIgY2xhc3M9InN1YkxvZ28iPjxpbWcgc3JjPSJjaWQ6QlRfTGluZS5qcGciIGFsdD0ibGluZSI+PC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgICA8L3RyPg0KCQkJCQkJPC90YWJsZT4JDQogICAgICAgICAgICAgICAgICAgICAgICA8dGFibGUgd2lkdGg9IjEwMCUiIGNlbGxwYWRkaW5nPSIwIiBjZWxsc3BhY2luZz0iMCIgYm9yZGVyPSIwIj4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIDx0ZCBiZ2NvbG9yPSIjZmZmZmZmIiBzdHlsZT0icGFkZGluZzogMjBweDsiIGNsYXNzPSJlbnRyeSIgdmFsaWduPSJ0b3AiPg0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICA8c3BhbiBzdHlsZT0iY29sb3I6IzMzMzMzMztmb250LXNpemU6MTRweDtsaW5lLWhlaWdodDoxLjI7Zm9udC1mYW1pbHk6J0hlbHZldGljYSBOZXVlJyxIZWx2ZXRpY2EsQXJpYWwsc2Fucy1zZXJpZjttYXJnaW4tYm90dG9tOjA7cGFkZGluZy10b3A6MDtwYWRkaW5nLWJvdHRvbTowO2ZvbnQtd2VpZ2h0Om5vcm1hbDsiPg0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgPGJyLz5UaGFuayB5b3UgZm9yIGNob29zaW5nIGVDb25uZWN0IE9ubGluZSBmcm9tIEJvdHRvbWxpbmUgVGVjaG5vbG9naWVzOyB5b3VyIHNlY3VyZSBjbG91ZCBkb2N1bWVudCBkZWxpdmVyeSBzZXJ2aWNlLiBUbyBjb21wbGV0ZSB0aGUgIHNldHVwIG9mIHlvdXIgYWNjb3VudCwgcGxlYXNlIGZvbGxvdyB0aGUgbGluayBiZWxvdy48YnIgLz4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIDwvc3Bhbj48YnIgLz48YnIgLz48YnIgLz4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgPGEgc3R5bGU9ImJhY2tncm91bmQtY29sb3I6IzNlN2Q2NTt0ZXh0LWRlY29yYXRpb246bm9uZTsgZm9u
dC1mYW1pbHk6J0hlbHZldGljYSBOZXVlJyxIZWx2ZXRpY2EsQXJpYWwsc2Fucy1zZXJpZjsgY29sb3I6I2ZmZmZmZjsgcGFkZGluZy10b3A6OHB4OyBwYWRkaW5nLWJvdHRvbTo4cHg7IHBhZGRpbmctbGVmdDo4cHg7IHBhZGRpbmctcmlnaHQ6OHB4OyBmb250LXNpemU6MThweDsgbWFyZ2luOiA4cHg7IiBocmVmPSJodHRwOi8vZWNvbm5lY3QuZW1lYS1ib3R0b21saW5lLnJvb3QuYm90dG9tbGluZS5jb20vYXBpL2FjY291bnQvc2lnbnVwY29tcGxldGUvMjBhNTE4YjktZGIzZS00OTkzLWFjN2UtYjE0YzZjMGVkMzMzIj48c3BhbiBzdHlsZT0iY29sb3I6I2ZmZmZmZiI+Q29tcGxldGUgQWNjb3VudCBTZXR1cCAmcmFxdW87PC9zcGFuPjwvYT48YnIgLz48YnIgLz48YnIgLz4NCg0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgPHNwYW4gc3R5bGU9ImNvbG9yOiMzMzMzMzM7Zm9udC1mYW1pbHk6J0hlbHZldGljYSBOZXVlJyxIZWx2ZXRpY2EsQXJpYWwsc2Fucy1zZXJpZjtmb250LXNpemU6MTRweDtsaW5lLWhlaWdodDoxLjI7Zm9u
dC1mYW1pbHk6J0hlbHZldGljYSBOZXVlJyxIZWx2ZXRpY2EsQXJpYWwsc2Fucy1zZXJpZjttYXJnaW4tYm90dG9tOjA7cGFkZGluZy10b3A6MDtwYWRkaW5nLWJvdHRvbTowO2ZvbnQtd2VpZ2h0Om5vcm1hbDsiPg0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgPGJyLz5LaW5kIFJlZ2FyZHMsPGJyIC8+DQogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgVGhlIGVDb25uZWN0IE9ubGluZSBUZWFtDQogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICA8L3NwYW4+PGJyIC8+PGJyIC8+DQoNCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIDxzcGFuIHN0eWxlPSJjb2xvcjojMzMzMzMzO2ZvbnQtc2l6ZToxNHB4O2xpbmUtaGVpZ2h0OjEuMjtmb250LWZhbWlseTonSGVsdmV0aWNhIE5ldWUnLEhlbHZldGljYSxBcmlhbCxzYW5zLXNlcmlmO21hcmdpbi1ib3R0b206MDtwYWRkaW5nLXRvcDowO3BhZGRpbmctYm90dG9tOjA7Zm9udC13ZWlnaHQ6bm9ybWFsOyI+DQogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICA8YnIvPkZvciBzdXBwb3J0IHBsZWFzZSBjb250YWN0OiBlbWVhLXN1cHBvcnRAYm90dG9tbGluZS5jb20gPGJyIC8+DQogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgVGVsOiAwODcwIDA4MSA4MjUwPGJyIC8+DQogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICA8L3NwYW4+DQoNCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgICAgICA8L3RyPg0KICAgICAgICAgICAgICAgICAgICAgICAgPC90YWJsZT4NCiAgICAgICAgICAgICAgICAgICAgICAgIDxici8+DQoJCQkJCQkJCQkNCiAgICAgICAgICAgICAgICAgICAgPC90ZD4NCiAgICAgICAgICAgICAgICA8L3RyPg0KICAgICAgICAgICAgPC90YWJsZT4NCgkJPC90ZD4NCgk8L3RyPg0KPC90YWJsZT4NCgkJCQkNCjx0YWJsZSB3aWR0aD0iNjAwIiBjZWxscGFkZGluZz0iMCIgY2VsbHNwYWNpbmc9IjAiIGJvcmRlcj0iMCIgYWxpZ249ImNlbnRlciIgY2xhc3M9ImZvb3RlciI+DQogICAgPHRyPg0KICAgICAgICA8dGQ+DQogICAgICAgICAgICA8dGFibGUgd2lkdGg9IjYwMCIgY2VsbHBhZGRpbmc9IjAiIGNlbGxzcGFjaW5nPSIwIiBib3JkZXI9IjAiIGFsaWduPSJjZW50ZXIiIGNsYXNzPSJ0YWJsZSIgc3R5bGU9ImJvcmRlci10b3A6MXB4IHNvbGlkICNjY2NjY2M7Ij4NCiAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgIDx0ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDwvdGQ+DQogICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgIDx0ZD48cCBzdHlsZT0iZm9udC1mYW1pbHk6dmVyZGFuYTsgY29sb3I6IzQ0NDQ0NDsgZm9udC1zaXplOjEwcHg7Ij4NCiAgICAgICAgICAgICAgICAgICAgICAg
ICAgICZjb3B5OyAyMDE0IEJvdHRvbWxpbmUgVGVjaG5vbG9naWVzLCBJbmMuIEFsbCBSaWdodHMgUmVzZXJ2ZWQ8L3A+PC90ZD4NCiAgICAgICAgICAgICAgICA8L3RyPgkNCiAgICAgICAgICAgIDwvdGFibGU+ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICANCiAgICAgICAgPC90ZD4NCiAgICA8L3RyPg0KPC90YWJsZT4NCgkNCjwvYm9keT4NCjwvaHRtbD4NCg==
Link to full dump from my code
Edit: my fixed code
try:
    message = gmail_service.users().messages().get(userId='me', id=thread['id'], format='raw').execute()
    # print 'Message snippet: %s' % message['snippet']
    msg_str = base64.urlsafe_b64decode(message['raw'].encode('ASCII'))
    msg = email.message_from_string(msg_str)
    for part in msg.walk():
        msg.get_payload()
        if part.get_content_type() == 'text/html':
            mytext = base64.urlsafe_b64decode(part.get_payload().encode('UTF-8'))
            # print part.get_payload()
            print mytext
The information in the documentation link from the answer I accepted was invaluable in solving my problem!
Answer 0 (score: 3)
To iterate over the parts of a multipart message in Python, you should use get_payload(): https://docs.python.org/2/library/email.message.html#email.message.Message.get_payload
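As a rough illustration of that call, here is a minimal sketch. It is written for Python 3 (the question uses Python 2), and the message text, the boundary name b1 and the helper top_level_types are made up for the example rather than taken from the question. For a multipart message, get_payload() returns a list of Message objects, one per direct child.

```python
from email import message_from_string

# Hypothetical two-part message, invented for this example.
RAW = "\n".join([
    'MIME-Version: 1.0',
    'Content-Type: multipart/alternative; boundary="b1"',
    '',
    '--b1',
    'Content-Type: text/plain; charset=utf-8',
    '',
    'Hello',
    '--b1',
    'Content-Type: text/html; charset=utf-8',
    '',
    '<p>Hello</p>',
    '--b1--',
    '',
])

msg = message_from_string(RAW)


def top_level_types(message):
    """Return the content types of the direct children of a message."""
    if not message.is_multipart():
        # A non-multipart message has a string payload, not a list of parts.
        return [message.get_content_type()]
    return [part.get_content_type() for part in message.get_payload()]


types = top_level_types(msg)
print(types)
```

Note that this only looks at one level of the message; nested multipart containers show up as a single entry such as multipart/alternative.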
In your example, the call to mime_msg['payload'] looks up a message header named "payload", which does not exist and is not what you want anyway.
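The failure can be reproduced without Gmail at all. The sketch below (Python 3, with a made-up one-line message) shows that indexing a Message with a name that is not a header returns None, and that iterating over that None raises the TypeError quoted in the question.

```python
from email import message_from_string

# Hypothetical message, invented for this example.
msg = message_from_string(
    "Subject: Signup\nContent-Type: text/plain\n\nbody text\n"
)

subject = msg['Subject']   # an existing header: its value is returned
missing = msg['payload']   # no header with this name: None is returned

try:
    for part in missing:   # same pattern as the loop in the question
        pass
    error = None
except TypeError as exc:
    error = str(exc)

print(subject, missing, error)
```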
Once you have a part, you can check its type with part['Content-Type'], which looks up the Content-Type header.
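One detail worth keeping in mind: the raw header usually carries parameters such as the charset, so comparing it directly to 'text/html' may fail. A small sketch (Python 3, with an invented part) comparing the header lookup with get_content_type(), the method the asker's fixed code uses:

```python
from email import message_from_string

# Hypothetical part, invented for this example.
part = message_from_string(
    "Content-Type: text/html; charset=utf-8\n\n<p>hi</p>\n"
)

header_value = part['Content-Type']      # full header, parameters included
content_type = part.get_content_type()   # just maintype/subtype, lowercased

is_html_by_header = header_value == 'text/html'
is_html_by_method = content_type == 'text/html'

print(header_value, content_type, is_html_by_header, is_html_by_method)
```

With the header lookup, a check such as header_value.startswith('text/html') would be needed instead of a plain equality test.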
In general, a MIME message is a tree of parts, so you may need to recurse.
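A minimal sketch of such a recursion follows, under several assumptions: it targets Python 3, the nested message and the example.com link are invented, and find_html and LinkCollector are hypothetical helper names. It also uses the standard library's html.parser as a stand-in for the Beautiful Soup step mentioned in the question, so that the example has no third-party dependency. get_payload(decode=True) undoes the Content-Transfer-Encoding (base64 here) and returns bytes.

```python
import base64
from email import message_from_string
from html.parser import HTMLParser

# Hypothetical HTML body and nested message, invented for this example.
HTML = '<html><body><a href="http://example.com/signup">Complete</a></body></html>'
ENCODED = base64.b64encode(HTML.encode('utf-8')).decode('ascii')

RAW = "\n".join([
    'MIME-Version: 1.0',
    'Content-Type: multipart/mixed; boundary="outer"',
    '',
    '--outer',
    'Content-Type: multipart/alternative; boundary="inner"',
    '',
    '--inner',
    'Content-Type: text/plain; charset=utf-8',
    '',
    'Complete your signup',
    '--inner',
    'Content-Type: text/html; charset=utf-8',
    'Content-Transfer-Encoding: base64',
    '',
    ENCODED,
    '--inner--',
    '--outer--',
    '',
])


def find_html(message):
    """Depth-first search for the first text/html leaf; returns str or None."""
    if message.is_multipart():
        for child in message.get_payload():
            found = find_html(child)
            if found is not None:
                return found
        return None
    if message.get_content_type() == 'text/html':
        data = message.get_payload(decode=True)  # bytes, base64 already undone
        charset = message.get_content_charset() or 'utf-8'
        return data.decode(charset, errors='replace')
    return None


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


html_text = find_html(message_from_string(RAW))
collector = LinkCollector()
collector.feed(html_text)
links = collector.links
print(links)
```

The msg.walk() loop in the asker's fixed code visits the same tree without explicit recursion; the hand-written version is shown only to make the tree structure visible.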