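Google Drive Python API: uploading large files

Time: 2018-02-26 17:26:43

Tags: python google-drive-api

I am writing a function that uploads files to Google Drive using the Python API client. It works for files up to 1 MB but not for a 10 MB file. When I try to upload a 10 MB file, I get an HTTP 400 error. Any help would be appreciated. Thanks.

Here is the output when I print the error: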

An error occurred: <HttpError 400 when requesting https://www.googleapis.com/upload/drive/v3/files?alt=json&uploadType=resumable returned "Bad Request">

Here is the output when I print error.resp:

{'server': 'UploadServer', 'status': '400', 'x-guploader-uploadid': '...', 'content-type': 'application/json; charset=UTF-8', 'date': 'Mon, 26 Feb 2018 17:00:12 GMT', 'vary': 'Origin, X-Origin', 'alt-svc': 'hq=":443"; ma=2592000; quic=51303431; quic=51303339; quic=51303338; quic=51303337; quic=51303335,quic=":443"; ma=2592000; v="41,39,38,37,35"', 'content-length': '171'}

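I cannot make sense of this error. I have looked at the Google API Error Guide, but its explanation does not help me, because all of the parameters are the same as in the requests for smaller files, and those requests succeed.

Here is my code: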

def insert_file_only(service, name, description, filename='', parent_id='root', mime_type=GoogleMimeTypes.PDF):
    """ Insert new file.

    Using documentation from Google Python API as a guide:
        https://developers.google.com/api-client-library/python/guide/media_upload

    Args:
        service: Drive API service instance.
        name: Name of the file to create, including the extension.
        description: Description of the file to insert.
        filename: Filename of the file to insert.
        parent_id: Parent folder's ID.
        mime_type: MIME type of the file to insert.
    Returns:
        Inserted file metadata if successful, None otherwise.
    """

    # Set the file meta data
    file_metadata = set_file_metadata(name, description, mime_type, parent_id)

    # Create media with correct chunk size
    if os.stat(filename).st_size <= 256*1024:
        media = MediaFileUpload(filename, mimetype=mime_type, resumable=True)
    else:
        media = MediaFileUpload(filename, mimetype=mime_type, chunksize=256*1024, resumable=True)

    file = None
    status = None
    start_from_beginning = True
    num_temp_errors = 0

    while file is None:
        try:
            if start_from_beginning:
                # Start from beginning
                logger.debug('Starting file upload')
                file = service.files().create(body=file_metadata, media_body=media).execute()
            else:
                # Upload next chunk
                logger.debug('Uploading next chunk')
                status, file = service.files().create(
                    body=file_metadata, media_body=media).next_chunk()
                if status:
                    logger.info('Uploaded {}%'.format(int(100*status.progress())))

        except errors.HttpError as error:
            logger.error('An error occurred: %s' % error)
            logger.error(error.resp)
            if error.resp.status in [404]:
                # Start the upload all over again
                start_from_beginning = True
            elif error.resp.status in [500, 502, 503, 504]:
                # Increment counter on number of temporary errors
                num_temp_errors += 1
                if num_temp_errors >= NUM_TEMP_ERROR_LIMIT:
                    return None
                # Call next chunk again
            else:
                return None

    permissions = assign_permissions(file, service)
    return file

UPDATE
I tried the simpler pattern suggested by @StefanE. However, I still get an HTTP 400 error for files over 1 MB. The new code looks like this:

request = service.files().create(body=file_metadata, media_body=media)
response = None
while response is None:
    status, response = request.next_chunk()
    if status:
        logger.info('Uploaded {}%'.format(int(100*status.progress())))

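UPDATE 2
I found that the problem is the conversion of the file to a Google Doc, not the upload itself. I am trying to upload an HTML file and convert it to a Google Doc. This works for files smaller than ~2 MB. When I upload the HTML file without trying to convert it, I do not get the error above. This appears to be consistent with the limits on this page. I do not know whether this limit can be increased.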
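One way to work around this is to request the conversion only when the file is small enough. The following is a minimal sketch, not a confirmed fix: the helper name metadata_for_upload and the 2 MB threshold are assumptions for illustration (the threshold is taken from the ~2 MB observed above, not from an official figure). It relies on the Drive v3 behaviour that setting mimeType to the Google Docs type in the file metadata asks Drive to convert the upload.

```python
# MIME type that asks Drive v3 to convert the uploaded file to a Google Doc.
GOOGLE_DOC_MIME = "application/vnd.google-apps.document"

# Assumed threshold, based on the ~2 MB observed in the question.
# This is not an official limit; adjust it after checking the Drive docs.
ASSUMED_CONVERSION_LIMIT = 2 * 1024 * 1024


def metadata_for_upload(name, size_bytes, limit=ASSUMED_CONVERSION_LIMIT):
    """Build Drive file metadata, requesting conversion only for small files.

    Hypothetical helper: files larger than `limit` are uploaded as-is,
    so no target mimeType is set for them.
    """
    metadata = {"name": name}
    if size_bytes <= limit:
        metadata["mimeType"] = GOOGLE_DOC_MIME
    return metadata
```

The size can come from os.path.getsize(filename), and the resulting dictionary would be passed as body= to service.files().create(...), with the MediaFileUpload object still carrying the source MIME type (text/html here).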

1 Answer:

Answer 0 (score: 1)

I see some issues with your code.

First, your while loop continues as long as file is None, and the first thing you do inside it is assign file, so when the call succeeds the loop runs only once.

Second, you have a variable start_from_beginning that is never set to False anywhere in the code, so the else branch will never be executed.

Looking at Google's documentation, their sample code is a lot more straightforward:

media = MediaFileUpload('pig.png', mimetype='image/png', resumable=True)
request = farm.animals().insert(media_body=media, body={'name': 'Pig'})
response = None
while response is None:
  status, response = request.next_chunk()
  if status:
    print("Uploaded %d%%." % int(status.progress() * 100))
print("Upload Complete!")

Here you loop while response is None, and response stays None until the upload has finished.
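If you also want the retry handling from your original function, it can be layered on top of that loop. This is only a sketch under some assumptions: run_resumable_upload, max_retries and the sleep parameter are names made up for illustration, and the error is inspected through error.resp.status, the same attribute your code already reads on HttpError. The key point is that the request object is created once and next_chunk() is called on that same object each time.

```python
import time

# Status codes treated as temporary, matching the list in the question.
RETRYABLE_STATUSES = {500, 502, 503, 504}


def run_resumable_upload(request, max_retries=5, sleep=time.sleep):
    """Call request.next_chunk() until the upload returns a response.

    `request` is assumed to be the object returned by
    service.files().create(body=..., media_body=...), created once.
    """
    response = None
    retries = 0
    while response is None:
        try:
            status, response = request.next_chunk()
        except Exception as error:
            # HttpError exposes the HTTP status as error.resp.status;
            # anything without that attribute is re-raised unchanged.
            code = getattr(getattr(error, "resp", None), "status", None)
            if code not in RETRYABLE_STATUSES or retries >= max_retries:
                raise
            retries += 1
            sleep(2 ** retries)  # simple exponential backoff
            continue
        retries = 0  # a successful chunk resets the retry counter
        if status:
            print("Uploaded %d%%" % int(status.progress() * 100))
    return response
```

A 400 like the one in the question is not in the retryable set, so it is raised immediately instead of being retried.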