I am struggling to get special characters from an email to display correctly.
I retrieve the message through the Gmail API like this:
msg_id = '169a8fac44fd8115'
service = build('gmail', 'v1', credentials=creds)
message = service.users().messages().get(userId='me', id=msg_id).execute()
htmlpart = message['payload']['parts'][0]['parts'][1]['body']['data']
Then I tried each of the following:
file_data = quopri.decodestring(base64.urlsafe_b64decode(htmlpart)).decode('iso-8859-1')
file_data = base64.urlsafe_b64decode(htmlpart.encode('UTF-8')).decode('iso-8859-1')
file_data = base64.urlsafe_b64decode(htmlpart.encode('iso-8859-1')).decode('utf-8')
file_data = base64.urlsafe_b64decode(htmlpart.encode('UTF-8')).decode('utf-8')
None of these gives me the correct output. Instead, I get something like €2 rather than €.
For reference, the headers of this message are:
'headers': [{'name': 'Content-Type', 'value': 'text/html; charset="UTF-8"'},
{'name': 'Content-Transfer-Encoding', 'value': 'quoted-printable'}]
Edit: added sample data below. I am trying to get the HTML of the email; I have copied only part of it below to highlight the encoding problem (You'll get).
</tr><tr><td class="m_4364729876101169671Uber18_text_p1" align="left" style="color:rgb(0,0,0);font-family:'Uber18-text-Regular','HelveticaNeue-Light','Helvetica Neue Light',Helvetica,Arial,sans-serif;font-size:16px;line-height:28px;direction:ltr;text-align:left"> Give friends free ride credit to try Uber. You&#39;ll get CN¥10 off each of your next 3 rides when they start riding. <span class="m_4364729876101169671Uber18_text_p1" style="color:#000000;font-family:'Uber18-text-Regular','HelveticaNeue-Light','Helvetica Neue Light',Helvetica,Arial,sans-serif;font-size:16px;line-height:28px">Share code: 20ccv</span></td>
Answer 0 (score: 2)
The headers
'headers': [{'name': 'Content-Type', 'value': 'text/html; charset="UTF-8"'},
{'name': 'Content-Transfer-Encoding', 'value': 'quoted-printable'}]
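tell you that the message consists of text encoded as UTF-8, which was then encoded as quoted-printable so that it can be handled by systems that only support 7-bit characters.
To decode it, first decode from quoted-printable, then decode the resulting bytes from UTF-8.
Something like this should work: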
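raw = base64.urlsafe_b64decode(htmlpart)  # the Gmail API returns body data base64url-encoded
utf8 = quopri.decodestring(raw)  # undo the quoted-printable encoding
text = utf8.decode('utf-8')  # interpret the resulting bytes as UTF-8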
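As a minimal sketch of that decoding order, the steps can be wrapped in a small helper and tried on a made-up payload. This assumes the body data is base64url-encoded, as in the question's own attempts, and that the charset really is UTF-8 as the header states. The name decode_qp_utf8_part and the sample string are hypothetical and only for illustration.

```python
import base64
import quopri


def decode_qp_utf8_part(data: str) -> str:
    """Decode a base64url string whose payload is quoted-printable UTF-8 text.

    Hypothetical helper for illustration; it is not part of the Gmail API.
    """
    # Restore any stripped '=' padding so the base64 decoder accepts the input.
    padded = data + "=" * (-len(data) % 4)
    raw = base64.urlsafe_b64decode(padded)
    # Undo quoted-printable: "=E2=82=AC" becomes the bytes b"\xe2\x82\xac".
    utf8_bytes = quopri.decodestring(raw)
    # Interpret the bytes using the charset named in the Content-Type header.
    return utf8_bytes.decode("utf-8")


# Build a made-up payload in the reverse order:
# UTF-8 bytes -> quoted-printable -> base64url.
sample_qp = quopri.encodestring("Price: €2".encode("utf-8"))
sample_data = base64.urlsafe_b64encode(sample_qp).decode("ascii")
decoded_sample = decode_qp_utf8_part(sample_data)
print(decoded_sample)
```

If the text still shows sequences such as =E2=82=AC after this, the quoted-printable step was skipped; if it shows mojibake, the bytes were probably decoded with the wrong charset.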
The HTML email body may contain character entities. These can be converted to single characters with html.unescape (available in Python 3.4+).
>>> import html
>>> h = """</tr><tr><td class="m_4364729876101169671Uber18_text_p1" align="left" style="color:rgb(0,0,0);font-family:'Uber18-text-Regular','HelveticaNeue-Light','Helvetica Neue Light',Helvetica,Arial,sans-serif;font-size:16px;line-height:28px;direction:ltr;text-align:left"> Give friends free ride credit to try Uber. You&#39;ll get CN¥10 off each of your next 3 rides when they start riding. <span class="m_4364729876101169671Uber18_text_p1" style="color:#000000;font-family:'Uber18-text-Regular','HelveticaNeue-Light','Helvetica Neue Light',Helvetica,Arial,sans-serif;font-size:16px;line-height:28px">Share code: 20ccv</span></td>"""
>>> print(html.unescape(h))
</tr><tr><td class="m_4364729876101169671Uber18_text_p1" align="left" style="color:rgb(0,0,0);font-family:'Uber18-text-Regular','HelveticaNeue-Light','Helvetica Neue Light',Helvetica,Arial,sans-serif;font-size:16px;line-height:28px;direction:ltr;text-align:left"> Give friends free ride credit to try Uber. You'll get CN¥10 off each of your next 3 rides when they start riding. <span class="m_4364729876101169671Uber18_text_p1" style="color:#000000;font-family:'Uber18-text-Regular','HelveticaNeue-Light','Helvetica Neue Light',Helvetica,Arial,sans-serif;font-size:16px;line-height:28px">Share code: 20ccv</span></td>
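Because the sample above is long, a shorter sketch may make the effect of html.unescape easier to see. The fragment below is made up for illustration and assumes the body contains a numeric entity (&#39;) and a named one (&euro;).

```python
import html

# Hypothetical fragment with one numeric and one named character entity.
snippet = "You&#39;ll get &euro;2 off"

# html.unescape replaces each entity with the character it stands for.
unescaped = html.unescape(snippet)
print(unescaped)
```

Text that contains no entities passes through html.unescape unchanged, so the call is safe to apply after the UTF-8 decoding step.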