Extracting .xlsx attachments from .msg files

Time: 2019-10-23 14:48:10

Tags: python vba attachment data-extraction

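I know this has been asked here a few times, and I have tried some of the approaches that worked for others... I have more than 1,000 Outlook .msg files stored in a folder on my desktop, each with an .xlsx attachment. I need to extract the .xlsx files so I can combine them into a single dataframe.

I have tried a VBA macro, Python Win32 (following "Parsing outlook .msg files with python"), and msg-extractor. The best I have managed is extracting a single attachment from a single .msg file.

Any suggestions would be greatly appreciated. Thank you!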

import os

import win32com.client

path = r'C:\Users\Me\Desktop\MyFiles\feb_2018'
# Match on the file extension only, ignoring case
files = [f for f in os.listdir(path) if f.lower().endswith('.msg')]
print(files)

# Connect to Outlook once instead of once per message
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
for file in files:
    msg = outlook.OpenSharedItem(os.path.join(path, file))
    # Each attachment is saved into the same folder under its original file name
    for att in msg.Attachments:
        att.SaveAsFile(os.path.join(path, att.FileName))


2 Answers:

Answer 0 (score: 1)

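I haven't tried saving attachments with win32com, so I can't say why only a single attachment from a single file gets saved. However, I was able to save multiple attachments using msg-extractor:
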
import os

import extract_msg

# `path` and `files` are the variables defined in the question
attach_path = "path where the files have to be saved."
os.makedirs(attach_path, exist_ok=True)  # create the output folder once

for file in files:
    # `files` holds bare file names, so join each one with its folder
    msg = extract_msg.Message(os.path.join(path, file))
    for attachment in msg.attachments:
        attachment.save(customPath=attach_path)
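
Since the question only needs the .xlsx attachments, the loop above could be narrowed to skip everything else (inline images, signatures, and so on). The following is a minimal sketch rather than a tested recipe: `is_xlsx` and `save_xlsx_attachments` are hypothetical helper names, and it assumes that msg-extractor exposes the attachment name as `longFilename` (falling back to `shortFilename`), which may differ between versions of the library.

```python
import os


def is_xlsx(filename):
    """Return True for names ending in .xlsx, ignoring case."""
    return bool(filename) and filename.lower().endswith(".xlsx")


def save_xlsx_attachments(msg_path, out_dir):
    """Save only the .xlsx attachments of one .msg file and return their names."""
    import extract_msg  # third-party package (msg-extractor)

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    msg = extract_msg.Message(msg_path)
    for attachment in msg.attachments:
        # Assumption: the name is available as longFilename or shortFilename
        name = getattr(attachment, "longFilename", None) or getattr(
            attachment, "shortFilename", None
        )
        if is_xlsx(name):
            attachment.save(customPath=out_dir)
            saved.append(name)
    return saved
```

It would be called once per message, for example `save_xlsx_attachments(os.path.join(path, file), attach_path)` inside the `for file in files:` loop.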

Answer 1 (score: 0)

Thank you! I actually worked out a way to extract multiple files with Win32 by adding a counter:

import os

import win32com.client

path = r'C:\Users\filepath'  # change path to directory where your msg files are located
files = [f for f in os.listdir(path) if f.lower().endswith('.msg')]
print(files)
counter = 0
# Connect to Outlook once instead of once per message
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
for file in files:
    msg = outlook.OpenSharedItem(os.path.join(path, file))
    for att in msg.Attachments:
        counter += 1
        # Prefix the file name with the running counter so every saved file gets a distinct name
        att.SaveAsFile(os.path.join(path, str(counter) + att.FileName))
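
With the attachments saved under distinct names, the remaining step from the question, combining the workbooks into a single dataframe, might look roughly like this. It is a sketch under several assumptions: pandas (with an Excel reader such as openpyxl) is installed, only the first sheet of each workbook matters, and all workbooks share the same columns. `list_xlsx`, `combine_workbooks` and the `source_file` column are hypothetical names introduced for illustration.

```python
import os


def list_xlsx(folder):
    """Return the sorted full paths of the .xlsx files in a folder."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        # Skip the "~$" lock files Excel creates while a workbook is open
        if name.lower().endswith(".xlsx") and not name.startswith("~$")
    )


def combine_workbooks(folder):
    """Read the first sheet of every workbook and stack the rows."""
    import pandas as pd  # third-party; reading .xlsx also needs openpyxl

    frames = []
    for workbook in list_xlsx(folder):
        frame = pd.read_excel(workbook)
        # Hypothetical extra column recording which file each row came from
        frame["source_file"] = os.path.basename(workbook)
        frames.append(frame)
    # pd.concat raises ValueError if no workbooks were found
    return pd.concat(frames, ignore_index=True)
```

Calling `combine_workbooks(path)` on the folder used above would return one dataframe with a row for every row in every extracted workbook.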