How to export a Google Doc from Drive in HTML format using Python

Time: 2018-03-25 14:35:29

Tags: python google-api google-drive-api

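I need to access a Google Doc stored in Google Drive and download its content as HTML. I want to do this with a Python script launched from my laptop's terminal. The document is not public, so I have to authorize against the server.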

I was pointed to this question, but it evidently addresses JavaScript users and is already five years old. The Google APIs have changed since then.

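I tried the PyDrive module, but I found no "HTML format" option in its API. The available information is vague, many examples refer to old Google APIs, and I did not find a specific reference for downloading a Google Doc as HTML with Python.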

I often use the gspread module. Is there anything similar for Google Docs?

Can someone point me to the right way to achieve this?

1 answer:

Answer 0 (score: 0)

In the end, this was my solution:

from __future__ import print_function
import httplib2
import os

import io
# `googleapiclient` is the current package name; `apiclient` is a legacy alias.
from googleapiclient import discovery
from googleapiclient.http import MediaIoBaseDownload
from oauth2client import client
from oauth2client import tools
from oauth2client.file import Storage

try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

# If modifying these scopes, delete your previously saved credentials
# at ~/.credentials/drive-python-quickstart.json
SCOPES = 'https://www.googleapis.com/auth/drive'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Drive API Python Quickstart'

def get_credentials():
    """Gets valid user credentials from storage.

    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.

    Returns:
        Credentials, the obtained credential.
    """
    home_dir = os.path.expanduser('~')
    credential_dir = os.path.join(home_dir, '.credentials')
    if not os.path.exists(credential_dir):
        os.makedirs(credential_dir)
    credential_path = os.path.join(credential_dir,
                                   'drive-python-quickstart.json')

    store = Storage(credential_path)
    credentials = store.get()
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        # run_flow runs the browser-based consent flow and saves the result to `store`.
        credentials = tools.run_flow(flow, store, flags)
        print('Storing credentials to ' + credential_path)
    return credentials

def main(docID, myDocPath):
    credentials = get_credentials()
    http = credentials.authorize(httplib2.Http())
    service = discovery.build('drive', 'v3', http=http)
    request = service.files().export_media(fileId=docID,
                                           mimeType='text/html')

    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    # Pull the export in chunks until the downloader reports completion.
    done = False
    while not done:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))

    with open(myDocPath, "wb") as f:
        f.write(fh.getvalue())

if __name__ == '__main__':
    myDocID = 'PUT_HERE_YOUR_DOC_ID'
    main(myDocID, 'some.html')
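
The `PUT_HERE_YOUR_DOC_ID` placeholder expects the document's ID. As far as I can tell, this is the segment that follows `/document/d/` in the URL shown in the browser when the document is open. Below is a minimal sketch of a helper that pulls it out of a pasted URL. The name `extract_doc_id` and the regular expression are my own assumptions, not part of the Drive API, and the pattern assumes the ID only contains letters, digits, `_` and `-`.

```python
import re

# Hypothetical helper, not part of the Drive API or the script above.
# Assumes the usual URL shape: https://docs.google.com/document/d/<ID>/edit
_DOC_ID_RE = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_doc_id(url_or_id):
    """Return the document ID found in a Google Docs URL.

    If the input does not contain '/document/d/<ID>', it is assumed to
    already be a bare ID and is returned unchanged.
    """
    match = _DOC_ID_RE.search(url_or_id)
    if match:
        return match.group(1)
    return url_or_id
```

With this in place, the last line of the script could be written as `main(extract_doc_id(some_url), 'some.html')`, where `some_url` is whatever was copied from the browser.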