Retrieve files from a Google Drive folder based on a search term

Time: 2019-12-30 05:51:14

Tags: python google-drive-api


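I have uploaded 7-8 GB of content to Google Drive, including pdf, docx, ppt and other files. My goal is to list all files that contain the terms in a user's query. For example, if I search for "computer vision" using the Google Drive API, the result should be a list of the files that contain the term "computer vision".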

This already works when I type into the search box in Google Drive, as shown in the screenshot below.

[Screenshot of the Google Drive search box]

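When I type "machine learning", I get a list of files. How can I retrieve the same results programmatically? I have read the Google Drive API documentation and came across the syntax "fullText contains 'term'", but I don't know how to use it.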

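1 answer:

Answer 0 (score: 0)

As you correctly say, a simple approach is to use the q parameter of the request together with the fullText contains X operator. Below is an adapted version of the Python Quickstart from the reference documentation that uses this operator: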

from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']

def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the files whose content matches the query.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
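        if creds and creds.expired and creds.refresh_token:
            # Expired credentials with a refresh token can be renewed silently.
            creds.refresh(Request())
        else:
            # Otherwise run the browser-based authorization flow.
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)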
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('drive', 'v3', credentials=creds)

    # Call the Drive v3 API
    results = service.files().list(
        pageSize=1000, fields="nextPageToken, files(id, name)", q="fullText contains 'computer vision'").execute()
    items = results.get('files', [])

    if not items:
        print('No files found.')
    else:
        for item in items:
            print(u'{0} ({1})'.format(item['name'], item['id']))

if __name__ == '__main__':
    main()

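The script above hard-codes the search term and reads only the first page of results, even though it asks for nextPageToken in fields. The following is a minimal sketch of one way to build the q string from a user-supplied term and follow nextPageToken until all matches are collected. The helper names build_fulltext_query and search_files are hypothetical (they are not part of the Drive API), and service is assumed to be the object returned by build('drive', 'v3', credentials=creds) as in the script above. The escaping follows the rule in the Drive query documentation that single quotes and backslashes inside a quoted value are escaped with a backslash.

```python
def build_fulltext_query(term, exact_phrase=False):
    """Build a Drive `q` string that searches file content for `term`.

    Hypothetical helper, not part of the Drive API.
    """
    # Values in a Drive query are single-quoted, so backslashes and single
    # quotes inside the term have to be escaped with a backslash.
    escaped = term.replace('\\', '\\\\').replace("'", "\\'")
    if exact_phrase:
        # Wrapping the term in double quotes requests an exact phrase match.
        escaped = '"{0}"'.format(escaped)
    return "fullText contains '{0}'".format(escaped)


def search_files(service, term, exact_phrase=False):
    """Return all matching files, following nextPageToken across pages.

    Hypothetical helper; `service` is assumed to be a Drive v3 service object.
    """
    query = build_fulltext_query(term, exact_phrase)
    files = []
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            pageSize=1000,
            fields="nextPageToken, files(id, name)",
            pageToken=page_token).execute()
        files.extend(response.get('files', []))
        # The token is absent on the last page, which ends the loop.
        page_token = response.get('nextPageToken')
        if not page_token:
            break
    return files
```

With a service object in hand, search_files(service, 'machine learning') would return a list of dictionaries with id and name keys, which can be printed the same way as items in the script above.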