Question

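I have a file that contains a TIFF image and a document XML inside a multipart MIME document. I would like to extract the image from this file. How can I get it?

I have this code, but with a large file (for example 30 MB) it takes forever to extract the image, so it is useless.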

f=open("content_file.txt","rb")
msg = email.message_from_file(f)
j=0
image=False
for i in msg.walk():
    if i.is_multipart():
        #print "MULTIPART: "
        continue
    if i.get_content_maintype() == 'text':
        j=j+1
        continue
    if i.get_content_maintype() == 'image':
        image=True
        j=j+1
        pl = i.get_payload(decode=True)
        localFile = open("map.out.tiff", 'wb')
        localFile.write(pl)
        continue
        f.close()
    if (image==False):
        sys.exit(0);

Thank you very much.

Answer 1

Solved:

# Note: mimetools and multifile are Python 2 modules; they were removed in Python 3.
import mimetools
import multifile
import StringIO

def extract_mime_part_matching(stream, mimetype):
    """Return the first element in a multipart MIME message on stream
    matching mimetype."""

    msg = mimetools.Message(stream)
    msgtype = msg.gettype()
    params = msg.getplist()

    data = StringIO.StringIO()
    if msgtype[:10] == "multipart/":

        file = multifile.MultiFile(stream)
        # Tell MultiFile which boundary string separates the parts.
        file.push(msg.getparam("boundary"))
        while file.next():
            submsg = mimetools.Message(file)
            try:
                data = StringIO.StringIO()
                mimetools.decode(file, data, submsg.getencoding())
            except ValueError:
                continue
            if submsg.gettype() == mimetype:
                break
        file.pop()
    return data.getvalue()

From: http://docs.python.org/release/2.6.6/library/multifile.html

Thanks for your support.
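
For readers on Python 3, where mimetools and multifile are no longer available, the same idea (return the first part matching a MIME type) can be sketched with the email package alone. This is only a rough equivalent, not the code from the linked documentation: the function name extract_first_part and its path parameter are made up for illustration, and the whole message is still parsed into memory, so it is not a streaming solution.

```python
import email
from email import policy


def extract_first_part(path, mimetype):
    """Return the decoded bytes of the first part whose content type
    equals mimetype, or None if no part matches.

    Hypothetical helper: the name and signature are assumptions.
    """
    # Binary mode, so the parser sees the bytes exactly as stored on disk.
    with open(path, "rb") as fp:
        msg = email.message_from_binary_file(fp, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == mimetype:
            # decode=True undoes base64 / quoted-printable transfer encoding.
            return part.get_payload(decode=True)
    return None
```

Something like extract_first_part("content_file.txt", "image/tiff") would then return the raw TIFF bytes, which can be written out with open("map.out.tiff", "wb").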

Answer 2

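I'm not sure why your code hangs. The indentation looks a bit off, and the opened file is not closed properly. You may also be running low on memory.

This version works for me: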

import email
import mimetypes

with open('email.txt') as fp:
    message = email.message_from_file(fp)

for i, part in enumerate(message.walk()):
    if part.get_content_maintype() == 'image':
        filename = part.get_filename()
        if not filename:
            ext = mimetypes.guess_extension(part.get_content_type())
            filename = 'image-%02d%s' % (i, ext or '.tiff')
        with open(filename, 'wb') as fp:
            fp.write(part.get_payload(decode=True))

(Partly taken from http://docs.python.org/library/email-examples.html#email-examples)

Extract content from a file with MIME multipart
