Question

Using the email.header package, I can do

the_text, the_charset = decode_header(inputText)[0]

to get the character set of an email header, where inputText was retrieved by a command like

inputText = msg.get('From')

using the From: header as an example.

In order to extract the header encoding of that header, do I have to do this?

the_header_encoding = email.charset.Charset(the_charset).header_encoding

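That is, do I have to create an instance of the Charset class based on the name of the charset (and would that even work?), or is there a way to extract the header encoding more directly from the header itself?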

Answer 1

An Encoded-Message header can consist of one or more lines, and each line can use a different encoding, or no encoding at all.
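As a rough illustration (the header value below is made up for this sketch), decode_header returns one (bytes, charset) pair per differently encoded part and does not report whether Q or B was used. Charset(...).header_encoding, for its part, describes what the email package would choose when encoding with that charset, not what a given header actually used:

```python
from email.charset import Charset, SHORTEST
from email.header import decode_header

# Hypothetical header mixing a Q-encoded and a B-encoded word.
sample = '=?iso-8859-1?q?p=F6stal?= =?utf-8?b?w7Y=?='

# Each part comes back with its charset; the Q/B flag is not included.
charsets = [charset for _, charset in decode_header(sample)]
print(charsets)

# header_encoding is a per-charset default used when *encoding* headers.
# For utf-8 it is SHORTEST, although the sample above used base64.
print(Charset('utf-8').header_encoding == SHORTEST)
```

So building a Charset from the decoded charset name runs, but it does not tell you how the incoming header was encoded.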

You'll have to parse out the type of encoding yourself, one per line. Using a regular expression:

import re

# Matches one RFC 2047 encoded word, e.g. =?utf-8?B?...?=, capturing the Q/B flag.
quopri_entry = re.compile(r'=\?[\w-]+\?(?P<encoding>[QB])\?[^?]+?\?=', flags=re.I)
encodings = {'Q': 'quoted-printable', 'B': 'base64'}

def encoded_message_codecs(header):
    """Return one encoding name (or None) per line of the header."""
    used = []
    for line in header.splitlines():
        entry = quopri_entry.search(line)
        if not entry:
            # No encoded word on this line.
            used.append(None)
            continue
        used.append(encodings.get(entry.group('encoding').upper(), 'unknown'))
    return used

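This returns a list with one entry per line, each being one of the strings quoted-printable, base64, or unknown, or None if that line contains no Encoded-Message.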
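Because the function above uses search(), it reports only the first encoded word on each line. If a single line may hold several encoded words, a variation along these lines could list all of them. This is only a sketch: the names encoded_word, names and encoded_words, the extra charset group, and the sample header are assumptions introduced here, not part of the original answer.

```python
import re

# Hypothetical pattern: same shape as above, but it also captures the charset.
encoded_word = re.compile(
    r'=\?(?P<charset>[\w-]+)\?(?P<encoding>[QB])\?[^?]+?\?=', flags=re.I)
names = {'Q': 'quoted-printable', 'B': 'base64'}

def encoded_words(header):
    """Return a (charset, encoding name) pair for every encoded word found."""
    return [
        (m.group('charset').lower(), names[m.group('encoding').upper()])
        for m in encoded_word.finditer(header)
    ]

# Made-up From: header value with two differently encoded words.
sample = '=?iso-8859-1?q?p=F6stal?= =?utf-8?b?w7Y=?= <user@example.com>'
print(encoded_words(sample))
```

A header without any encoded word yields an empty list.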
