How to upload a file without loading the whole file into memory

Date: 2014-11-04 16:05:26

Tags: python bottle

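I am building an upload API with Bottle. The script below can upload a file to a directory, but there are two issues I need to solve. One is how to avoid loading the whole file into memory; the other is how to set a maximum size for uploaded files.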

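Is it possible to read the file continuously and dump what has been read to a file until the upload completes? The upload.save(file_path, overwrite=False, chunk_size=1024) function seems to load the whole file into memory. In this tutorial, they point out that using .read() is dangerous.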

import logging
import os
import sys
import time

from bottle import request, route, run, default_app

logger = logging.getLogger(__name__)

@route('/upload', method='POST')
def upload_file():
    function_name = sys._getframe().f_code.co_name
    try:
        upload = request.files.get("upload_file")
        if not upload:
            return "Nothing to upload"
        else:
            # Get the file name and the extension
            file_name, ext = os.path.splitext(upload.filename)
            if ext in ('.exe', '.msi', '.py'):
                return "File extension not allowed."

            # Determine the folder to save the upload in
            save_folder = "/tmp/{folder}".format(folder='external_files')
            if not os.path.exists(save_folder):
                os.makedirs(save_folder)

            # Determine file_path, prefixed with a timestamp
            # (time_now was undefined in the original; this format is an assumption)
            time_now = time.strftime("%Y%m%d%H%M%S")
            file_path = "{path}/{time_now}_{file}".format(
                path=save_folder, time_now=time_now, file=upload.filename)

            # Save the upload to file in chunks
            upload.save(file_path, overwrite=False, chunk_size=1024)
            return "File successfully saved {0}{1} to '{2}'.".format(file_name, ext, save_folder)

    except KeyboardInterrupt:
        logger.info('%s: Someone pressed Ctrl+C', function_name)
    except Exception:
        logger.error('%s: upload failed', function_name, exc_info=True)
        print("Exception occurred. Location: %s" % function_name)

if __name__ == '__main__':
    run(host="localhost", port=8080, reloader=True, debug=True)
else:
    application = default_app()

I also tried doing a file.write myself, but the same thing happens: the file is read into memory and the machine hangs.

# Copy the upload to disk 1024 bytes at a time; the with block closes the file
with open(output_file_path, "wb") as file_to_write:
    while True:
        datachunk = upload.file.read(1024)
        if not datachunk:
            break
        file_to_write.write(datachunk)

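Related to this, I have seen the MEMFILE_MAX attribute, which several SO posts claim can be used to set the maximum upload size. I have tried setting it, but it does not seem to have any effect, since all files go through regardless of size.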
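
One way to enforce a cap that does not depend on MEMFILE_MAX is to count bytes while copying. My understanding, which should be checked against the Bottle source, is that MEMFILE_MAX sets the point at which the request body is buffered in a temporary file instead of in memory, rather than a limit on file uploads. The sketch below is a minimal illustration under that assumption: copy_with_limit, UploadTooLarge, and the 10 MB default are hypothetical names and values, not part of Bottle, and src can be any file-like object such as upload.file.

```python
import io


class UploadTooLarge(Exception):
    """Raised when the source yields more bytes than allowed (hypothetical helper)."""


def copy_with_limit(src, dest, max_bytes=10 * 1024 * 1024, chunk_size=64 * 1024):
    """Copy src to dest chunk by chunk and return the number of bytes written.

    Raises UploadTooLarge as soon as the running total exceeds max_bytes,
    so at most one chunk is held in memory at a time.
    """
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return written
        if written + len(chunk) > max_bytes:
            raise UploadTooLarge("upload exceeds %d bytes" % max_bytes)
        dest.write(chunk)
        written += len(chunk)


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 5000)
    target = io.BytesIO()
    print(copy_with_limit(source, target, max_bytes=10000, chunk_size=1024))
```

In a handler this could be called with upload.file as src and an opened destination file as dest; on UploadTooLarge the partially written file would need to be deleted and an error response returned. Checking request.content_length before copying is a cheaper first filter, but it reflects a client-supplied header, so the byte count is still worth keeping.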

Note that I want to be able to receive office documents, which may arrive either with their normal extension or zipped with a password.
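
Given that requirement, an allow-list of extensions may be easier to maintain than the block-list used in the script. The following is only a sketch: ALLOWED_EXTENSIONS and is_allowed are hypothetical names, and the set of extensions is an assumed example of "office documents plus zip", not a list taken from the question.

```python
import os

# Assumed example set; adjust to the document types actually expected.
ALLOWED_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".zip"}


def is_allowed(filename, allowed=ALLOWED_EXTENSIONS):
    """Return True if the filename's last extension is in the allow-list.

    The comparison is case-insensitive, so "Report.DOCX" is accepted.
    """
    ext = os.path.splitext(filename)[1].lower()
    return ext in allowed


if __name__ == "__main__":
    for name in ("report.DOCX", "setup.exe", "archive.zip", "noextension"):
        print(name, is_allowed(name))
```

An extension check only looks at the name the client sent; it says nothing about the file's actual content, and the contents of a password-protected zip cannot be inspected this way at all.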

Using Python 3.4 and Bottle 0.12.7.

1 answer:

Answer 0 (score: 1)

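Basically, you want to call upload.file.read(1024) in a loop. Something like this (untested):

# upload.file is the underlying file object of the Bottle upload
with open(file_path, 'wb') as dest:
    chunk = upload.file.read(1024)
    while chunk:
        dest.write(chunk)
        chunk = upload.file.read(1024)

(Do not call open on the upload itself; it is already open for you.)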

This SO answer has more examples of how to read a large file without "slurping" it.
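
As one more example of reading without slurping, the standard library's shutil.copyfileobj performs the same chunked loop. This is a minimal sketch, not Bottle's own method: save_stream is a hypothetical helper name, the 64 KB chunk size is an arbitrary choice, and src stands for any already-open binary file object such as upload.file.

```python
import io
import os
import shutil
import tempfile


def save_stream(src, file_path, chunk_size=64 * 1024):
    """Copy a binary file-like object to file_path in fixed-size chunks.

    Mode "xb" makes open() fail if file_path already exists, which is
    similar in spirit to overwrite=False.
    """
    with open(file_path, "xb") as dest:
        shutil.copyfileobj(src, dest, chunk_size)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "example.bin")
    save_stream(io.BytesIO(b"abc" * 100000), target)
    print(os.path.getsize(target))
```

Only one chunk is held in memory at a time, so peak memory use depends on chunk_size rather than on the size of the upload.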