I have filtered all the emails I want to request into a Gmail label, and I can successfully get mail back by using the following code in their quickstart.py script:
# My Code
results = service.users().messages().list(userId='me', labelIds='{Label_id}', maxResults='10000000').execute()
messages = results.get('messages', [])
for message in messages:
    msg = service.users().messages().get(userId='me', id=message['id'], format='metadata', metadataHeaders=['subject']).execute()
    print(msg['snippet'].encode('utf-8').strip())
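I first listed all labels and their IDs in an earlier request, then substituted the relevant ID for {Label_id}. After that, I request only the subject metadata field. The problem is that the response returns exactly 1 MB of data. I know this because I redirected the output to a file and ran ls -latr --block-size=MB. Judging by the dates, I can also see that the label contains more (older) messages than the ones returned. The request always stops at exactly the same message. None of the messages have attachments.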
According to their API reference, I should be allowed:
Daily Usage 1,000,000,000 quota units per day
Per User Rate Limit 250 quota units per user per second
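I don't think I'm hitting these limits, but maybe I'm wrong: each message has 1-3 replies in its thread, so perhaps each email counts as 5 quota units? I'm not sure. I tried playing with the maxResults parameter, but that doesn't seem to change anything.
Am I hitting a cap here, or is the problem in my request logic?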
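For a rough sense of scale, here is a minimal sketch of the quota arithmetic. It assumes that messages.list and messages.get each cost 5 quota units per call, which should be checked against the current Gmail API usage-limits page; the two limits are the ones quoted above, and estimate_quota is a hypothetical helper name.

```python
import math

# Assumed per-call costs in quota units; verify against the usage-limits page.
LIST_COST = 5  # users.messages.list
GET_COST = 5   # users.messages.get

# Limits quoted in the question.
PER_USER_PER_SECOND = 250
DAILY_LIMIT = 1_000_000_000


def estimate_quota(num_messages, page_size=500):
    """Return the quota units needed to list and fetch num_messages."""
    # At least one list call is made even when the label is empty.
    pages = max(1, math.ceil(num_messages / page_size))
    return pages * LIST_COST + num_messages * GET_COST


units = estimate_quota(10_000)
print(units)                             # units needed for 10,000 messages
print(units / DAILY_LIMIT)               # fraction of the daily allowance
print(PER_USER_PER_SECOND // GET_COST)   # get calls per second under the per-user limit
```

Under these assumptions, a label with a few thousand messages stays far below the daily limit, so a hard stop at the same message is more likely to come from the request logic than from quota.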
EDIT 1
from __future__ import print_function
import pickle
import os.path
import base64
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://mail.google.com/']

def main():
    """Shows basic usage of the Gmail API.
    Lists the user's Gmail labels.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server()
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('gmail', 'v1', credentials=creds)

    messageArray = []
    pageToken = None
    while True:
        results = service.users().messages().list(userId='me', labelIds='{Label_ID}', maxResults=500, pageToken=pageToken).execute()
        messages = results.get('messages', [])
        for message in messages:
            msg = service.users().messages().get(userId='me', id=message['id'], format='metadata', metadataHeaders=['subject']).execute()
            messageArray.append(msg)
        pageToken = results.get('nextPageToken', None)
        if not pageToken:
            print('[%s]' % ', '.join(map(str, messageArray)))
            break

if __name__ == '__main__':
    main()
EDIT 2
This is the final script I used. It produces a nicer, cleaner format, which I redirect to a file and which is easy to parse.
from __future__ import print_function
import pickle
import os.path
import base64
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://mail.google.com/']

def main():
    """Shows basic usage of the Gmail API.
    Lists the user's Gmail labels.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server()
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('gmail', 'v1', credentials=creds)

    pageToken = None
    while True:
        results = service.users().messages().list(userId='me', labelIds='{Label_ID}', maxResults=500, pageToken=pageToken).execute()
        messages = results.get('messages', [])
        for message in messages:
            msg = service.users().messages().get(userId='me', id=message['id'], format='metadata', metadataHeaders=['subject']).execute()
            print(msg['snippet'].encode('utf-8').strip())
        pageToken = results.get('nextPageToken', None)
        if not pageToken:
            break

if __name__ == '__main__':
    main()
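Answer 0 (score: 1)
The maximum value of maxResults is 500. If you set it higher, you will still receive only 500 messages in the result. You can confirm this by checking the length of messages.
You need to implement pagination.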
messages = []
pageToken = None
while True:
    results = service.users().messages().list(userId='me', labelIds='{Label_id}', maxResults=500, pageToken=pageToken).execute()
    # extend (not append) keeps messages a flat list; the key is the string 'messages'
    messages.extend(results.get('messages', []))
    pageToken = results.get('nextPageToken', None)
    if not pageToken:
        break
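The same pagination loop can be wrapped in a generator so the caller never handles page tokens directly. This is only a sketch: iter_message_ids is a hypothetical helper name, and FakeService is a made-up stand-in for the real service object so the loop can be exercised offline. It mimics only the users().messages().list(...).execute() chain used above.

```python
def iter_message_ids(service, label_id, page_size=500):
    """Yield every message id under label_id, following nextPageToken."""
    page_token = None
    while True:
        results = service.users().messages().list(
            userId='me', labelIds=[label_id],
            maxResults=page_size, pageToken=page_token).execute()
        for message in results.get('messages', []):
            yield message['id']
        page_token = results.get('nextPageToken')
        if not page_token:
            break


class _FakeRequest:
    """Hypothetical request object whose execute() returns a canned payload."""
    def __init__(self, payload):
        self._payload = payload

    def execute(self):
        return self._payload


class FakeService:
    """Hypothetical stand-in for the Gmail service, for offline testing only."""
    def __init__(self, ids):
        self._ids = ids

    def users(self):
        return self

    def messages(self):
        return self

    def list(self, userId, labelIds, maxResults, pageToken=None):
        # The fake page token is simply the start offset of the next page.
        start = int(pageToken or 0)
        end = start + maxResults
        payload = {'messages': [{'id': i} for i in self._ids[start:end]]}
        if end < len(self._ids):
            payload['nextPageToken'] = str(end)
        return _FakeRequest(payload)


fake = FakeService([str(n) for n in range(1200)])
ids = list(iter_message_ids(fake, 'Label_1'))
print(len(ids))  # all ids across three pages, not just the first 500
```

With the real service object, each yielded id can be passed to messages().get() exactly as in the question's loop.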
If you only want the raw, unparsed email, try using
# at top of file
from base64 import urlsafe_b64decode
msg = service.users().messages().get(userId='me', id=message['id'], format='raw').execute()
print(urlsafe_b64decode(msg['raw']))
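If you later need individual headers from the raw form, the decoded bytes can be handed to the standard library's email parser. This is a minimal sketch: subject_from_raw is a hypothetical helper, and the sample payload is made up here to stand in for msg['raw'] from a real response.

```python
from base64 import urlsafe_b64decode, urlsafe_b64encode
from email import message_from_bytes


def subject_from_raw(raw):
    """Decode a base64url 'raw' payload and return its Subject header."""
    mime = message_from_bytes(urlsafe_b64decode(raw))
    return mime['Subject']


# Made-up sample standing in for msg['raw'] from the API response.
sample = urlsafe_b64encode(
    b"Subject: Hello\r\nFrom: a@example.com\r\n\r\nBody text\r\n"
).decode('ascii')

print(subject_from_raw(sample))
```

Parsing locally this way means one format='raw' request per message gives you every header, at the cost of downloading the full message body each time.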