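I have tried several different Python scripts to download CSV attachments from Gmail, but I could not get any of them to work. Is this possible? If so, which Python script should I use? Thanks.
Answer 0 (score: 1)
If you want to skip all the details in this answer, I put together a GitHub repository that makes getting CSV data from Gmail very simple: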
from gmail import *
# get all attachments from e-mails containing 'test'
search_query = "test"
service = get_gmail_service()
csv_dfs = query_for_csv_attachments(service, search_query)
print(csv_dfs)
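Follow the instructions in the README and have fun; feel free to contribute!
The long version uses google-api-python-client and oauth2client.
Follow this link and click the button "ENABLE THE GMAIL API":
https://developers.google.com/gmail/api/quickstart/python
Once setup is complete, you will download a file named credentials.json.
Install the required Python packages: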
pip install --upgrade google-api-python-client oauth2client
The following snippet lets you connect to your Gmail account from Python:
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools

# SCOPES was not defined in the original snippet; read-only access is assumed here
SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'
GMAIL_CREDENTIALS_PATH = 'credentials.json'  # downloaded
GMAIL_TOKEN_PATH = 'token.json'  # this will be created

store = file.Storage(GMAIL_TOKEN_PATH)
creds = store.get()
if not creds or creds.invalid:
    # First run: go through the consent flow and cache the token
    flow = client.flow_from_clientsecrets(GMAIL_CREDENTIALS_PATH, SCOPES)
    creds = tools.run_flow(flow, store)
service = build('gmail', 'v1', http=creds.authorize(Http()))
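With this service object you can now read your e-mails and any attachments they contain.
First, query your e-mails with a search string to find the IDs of the messages whose attachments you need: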
search_query = "ABCD"
results = service.users().messages().list(userId='me', q=search_query).execute()
msgs = results['messages']
msg_ids = [msg['id'] for msg in msgs]
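One caveat with the query above: `messages.list` returns results one page at a time, and the response has no `messages` key when nothing matches. Here is a minimal sketch of paging through all matches, assuming `service` is the object built earlier. The helper name `list_message_ids` is my own and not part of the Gmail client.

```python
def list_message_ids(service, query, user_id='me'):
    """Collect the ids of all messages matching a Gmail search query."""
    msg_ids = []
    page_token = None
    while True:
        kwargs = {'userId': user_id, 'q': query}
        if page_token:
            # Only send pageToken once the API has handed us one
            kwargs['pageToken'] = page_token
        response = service.users().messages().list(**kwargs).execute()
        # 'messages' is absent when the query matches nothing
        msg_ids.extend(m['id'] for m in response.get('messages', []))
        page_token = response.get('nextPageToken')
        if not page_token:
            return msg_ids
```

A query such as `has:attachment filename:csv` should narrow the search to messages that carry CSV attachments.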
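Now, for each messageId, you can find the associated attachments in the e-mail.
This part is a bit messy, so bear with me. First we get a list of the "attachment parts" in the e-mail (and the attachment file names). These are the components of the e-mail that contain attachments: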
messageId = 'XYZ'
msg = service.users().messages().get(userId='me', id=messageId).execute()
parts = msg.get('payload').get('parts')
all_parts = []
for p in parts:
    # Flatten one level of nested multipart containers
    if p.get('parts'):
        all_parts.extend(p.get('parts'))
    else:
        all_parts.append(p)
att_parts = [p for p in all_parts if p['mimeType'] == 'text/csv']
filenames = [p['filename'] for p in att_parts]
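The loop above flattens only one level of nesting and fails on messages whose payload has no `parts` at all. A recursive variant is sketched below. The function name `find_csv_parts` is hypothetical, and the extra file-name check is my own assumption, added because some mail clients label CSV files with a generic MIME type.

```python
def find_csv_parts(payload):
    """Return every part under a Gmail message payload that looks like a CSV attachment."""
    found = []
    for part in payload.get('parts', []):
        if part.get('parts'):
            # Multipart container: descend into it
            found.extend(find_csv_parts(part))
        elif (part.get('mimeType') == 'text/csv'
              or part.get('filename', '').lower().endswith('.csv')):
            found.append(part)
    return found
```

It can be called as `find_csv_parts(msg['payload'])` in place of the flattening loop.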
Now we can get the attached CSV from each part:
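messageId = 'XYZ'
part = att_parts[0]  # repeat for each entry in att_parts
data = part['body'].get('data')
attachmentId = part['body'].get('attachmentId')
if not data:
    # Larger attachments are not inlined and must be fetched separately
    att = service.users().messages().attachments().get(
        userId='me', id=attachmentId, messageId=messageId).execute()
    data = att['data']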
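Wrapped as a function, the same step can be applied to every attachment part in turn. This is only a sketch: `get_part_data` is a name I made up, and it assumes `service` is the Gmail service object from earlier.

```python
def get_part_data(service, message_id, part, user_id='me'):
    """Return the base64url-encoded body of one message part, or None if it has none."""
    body = part.get('body', {})
    data = body.get('data')
    if data:
        # Small bodies are inlined in the message itself
        return data
    attachment_id = body.get('attachmentId')
    if not attachment_id:
        return None
    att = service.users().messages().attachments().get(
        userId=user_id, messageId=message_id, id=attachment_id).execute()
    return att['data']
```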
Now you have the CSV data, but it is base64url-encoded, so the last step is to decode it and load the result into a pandas DataFrame:
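import base64
import pandas as pd
from io import StringIO  # Python 3 location of StringIO
# Decode base64url to bytes, then to text so pandas can parse it
str_csv = base64.urlsafe_b64decode(data.encode('UTF-8')).decode('UTF-8')
df = pd.read_csv(StringIO(str_csv))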
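That's it! You now have a pandas DataFrame with the contents of the CSV attachment, and you can work with it directly. Or, if you only want to download the CSV, you can write it to disk with pd.DataFrame.to_csv. If you want to keep the file names, use the filenames list we obtained earlier.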
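If the goal is only to save the file, the decoded bytes can also be written straight to disk without going through pandas. Below is a rough sketch. `save_attachment` is a hypothetical helper, the padding fix is a precaution in case the encoded string arrives without trailing `=` characters, and taking the base name is a simple guard against file names that contain directory components.

```python
import base64
import os

def save_attachment(data, filename, folder='.'):
    """Decode base64url attachment data and write it under folder; return the path."""
    # Restore any stripped '=' padding before decoding
    raw = base64.urlsafe_b64decode(data + '=' * (-len(data) % 4))
    # Drop directory components so the file always lands inside folder
    safe_name = os.path.basename(filename) or 'attachment.csv'
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, safe_name)
    with open(path, 'wb') as fp:
        fp.write(raw)
    return path
```

For example, `save_attachment(data, filenames[0], 'downloads')` would store the first attachment under its original name.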
Answer 1 (score: 1)
An up-to-date answer is available at "Download attachment from mail using Python":
import os
import traceback
from imbox import Imbox  # pip install imbox

# enable less secure apps on your google account
# https://myaccount.google.com/lesssecureapps
host = "imap.gmail.com"
username = "username"
password = 'password'
download_folder = "/path/to/download/folder"

if not os.path.isdir(download_folder):
    os.makedirs(download_folder, exist_ok=True)

mail = Imbox(host, username=username, password=password, ssl=True, ssl_context=None, starttls=False)
messages = mail.messages()  # defaults to inbox

for (uid, message) in messages:
    mail.mark_seen(uid)  # optional, mark message as read
    for idx, attachment in enumerate(message.attachments):
        try:
            att_fn = attachment.get('filename')
            download_path = f"{download_folder}/{att_fn}"
            print(download_path)
            with open(download_path, "wb") as fp:
                fp.write(attachment.get('content').read())
        except Exception:
            # Report the failure and continue with the next attachment
            traceback.print_exc()

mail.logout()

"""
Available Message filters:

# Gets all messages from the inbox
messages = mail.messages()

# Unread messages
messages = mail.messages(unread=True)

# Flagged messages
messages = mail.messages(flagged=True)

# Un-flagged messages
messages = mail.messages(unflagged=True)

# Messages sent FROM
messages = mail.messages(sent_from='sender@example.org')

# Messages sent TO
messages = mail.messages(sent_to='receiver@example.org')

# Messages received before specific date
messages = mail.messages(date__lt=datetime.date(2018, 7, 31))

# Messages received after specific date
messages = mail.messages(date__gt=datetime.date(2018, 7, 30))

# Messages received on a specific date
messages = mail.messages(date__on=datetime.date(2018, 7, 30))

# Messages whose subjects contain a string
messages = mail.messages(subject='Christmas')

# Messages from a specific folder
messages = mail.messages(folder='Social')
"""
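Answer 2 (score: 0)
I got it working. This is not my own work: I took a few pieces of code, combined them, and modified them into the code below. In the end, it worked.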
import email
import imaplib
import os

print('Proceeding')

userName = 'yourgmail@gmail.com'
passwd = 'yourpassword'
detach_dir = '.'
if 'DataFiles' not in os.listdir(detach_dir):
    os.mkdir('DataFiles')

try:
    imapSession = imaplib.IMAP4_SSL('imap.gmail.com')
    typ, accountDetails = imapSession.login(userName, passwd)
    if typ != 'OK':
        raise RuntimeError('Not able to sign in!')

    # Mailbox names containing spaces must be quoted under Python 3
    imapSession.select('"[Gmail]/All Mail"')
    typ, data = imapSession.search(None, 'ALL')
    if typ != 'OK':
        raise RuntimeError('Error searching Inbox.')

    # Iterate over all e-mails
    for msgId in data[0].split():
        typ, messageParts = imapSession.fetch(msgId, '(RFC822)')
        if typ != 'OK':
            raise RuntimeError('Error fetching mail.')

        emailBody = messageParts[0][1]
        mail = email.message_from_bytes(emailBody)
        for part in mail.walk():
            if part.get_content_maintype() == 'multipart':
                continue
            if part.get('Content-Disposition') is None:
                continue
            fileName = part.get_filename()
            if bool(fileName):
                filePath = os.path.join(detach_dir, 'DataFiles', fileName)
                # Skip files that were already downloaded
                if not os.path.isfile(filePath):
                    print(fileName)
                    with open(filePath, 'wb') as fp:
                        fp.write(part.get_payload(decode=True))
    imapSession.close()
    imapSession.logout()
    print('Done')
except Exception as exc:
    print('Not able to download all attachments:', exc)
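The script above saves every attachment it finds. Since the question is specifically about CSV files, the walking step could be narrowed as sketched below. `extract_csv_attachments` is a name of my own, and the small in-memory message only makes the sketch runnable without a mailbox.

```python
import email
from email.message import EmailMessage

def extract_csv_attachments(raw_message):
    """Return (filename, content_bytes) pairs for CSV attachments in a raw RFC 822 message."""
    mail = email.message_from_bytes(raw_message)
    found = []
    for part in mail.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        filename = part.get_filename()
        # Keep only parts whose file name ends in .csv
        if filename and filename.lower().endswith('.csv'):
            found.append((filename, part.get_payload(decode=True)))
    return found

# Build a small message in memory to try the function without a mailbox
demo = EmailMessage()
demo['Subject'] = 'report'
demo.set_content('see attachment')
demo.add_attachment(b'a,b\n1,2\n', maintype='text', subtype='csv',
                    filename='report.csv')
csv_files = extract_csv_attachments(demo.as_bytes())
```

In the IMAP loop, `emailBody` from `imapSession.fetch` would be passed in place of `demo.as_bytes()`.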
Answer 3 (score: 0)
from imap_tools import MailBox
# get all .csv attachments from INBOX and save them to files
with MailBox('imap.my.ru').login('acc', 'pwd', 'INBOX') as mailbox:
    for msg in mailbox.fetch():
        for att in msg.attachments:
            if att.filename.lower().endswith('.csv'):
                with open('C:/1/{}'.format(att.filename), 'wb') as f:
                    f.write(att.payload)