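Tornado large file download

Time: 2016-06-10 16:30:01

Tags: python io tornado

I am experimenting with Tornado's content handling. My code for reading the file and writing it to the response is: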

with open(file_name, 'rb') as f:
    while True:
        data = f.read(4096)
        if not data:
            break
        self.write(data)
self.finish()

I expected memory usage to stay flat, since the file is not read all at once. But Resource Monitor shows:

In use    Available
12.7 GB   2.5 GB

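Sometimes it even makes my computer unresponsive... How can I download a large file (say, 12 GB)?

1 Answer:

Answer 0: (score: 0)

Tornado 6.0 provides an API that can be used to download large files. It can be used roughly like this: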

import aiofiles

async def get(self):
    self.set_header('Content-Type', 'application/octet-stream')
    # aiofiles delegates reads to a thread pool; it is not true asynchronous I/O
    async with aiofiles.open(r"F:\test.xyz", "rb") as f:
        while True:
            data = await f.read(1024)
            if not data:
                break
            self.write(data)
            # flush() is the important call: it sends the chunk right away, so
            # memory stays low. Awaiting it waits until the chunk has been written.
            await self.flush()

Using aiofiles alone, without calling self.flush(), may not solve the problem.

To see why, look at the self.write() method:

def write(self, chunk: Union[str, bytes, dict]) -> None:
    """Writes the given chunk to the output buffer.

    To write the output to the network, use the `flush()` method below.

    If the given chunk is a dictionary, we write it as JSON and set
    the Content-Type of the response to be ``application/json``.
    (if you want to send JSON as a different ``Content-Type``, call
    ``set_header`` *after* calling ``write()``).

    Note that lists are not converted to JSON because of a potential
    cross-site security vulnerability.  All JSON output should be
    wrapped in a dictionary.  More details at
    http://haacked.com/archive/2009/06/25/json-hijacking.aspx/ and
    https://github.com/facebook/tornado/issues/1009
    """
    if self._finished:
        raise RuntimeError("Cannot write() after finish()")
    if not isinstance(chunk, (bytes, unicode_type, dict)):
        message = "write() only accepts bytes, unicode, and dict objects"
        if isinstance(chunk, list):
            message += (
                ". Lists not accepted for security reasons; see "
                + "http://www.tornadoweb.org/en/stable/web.html#tornado.web.RequestHandler.write"  # noqa: E501
            )
        raise TypeError(message)
    if isinstance(chunk, dict):
        chunk = escape.json_encode(chunk)
        self.set_header("Content-Type", "application/json; charset=UTF-8")
    chunk = utf8(chunk)
    self._write_buffer.append(chunk)

Note the last line of the method: it only appends the data to be sent to _write_buffer.

Unless flush() is called earlier, the buffered data is sent only when the get or post method completes and finish() is called.
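To make the effect concrete, here is a minimal toy model of that buffering. It is not Tornado's code: `ToyHandler`, `stream`, and the byte counts are hypothetical names and values chosen for illustration, and the "network" is just a counter. It only shows how the peak amount of buffered data differs when each chunk is flushed versus when everything waits for finish().

```python
class ToyHandler:
    """Toy stand-in for a request handler; NOT Tornado's implementation."""

    def __init__(self):
        self._write_buffer = []   # chunks queued by write()
        self._buffered = 0        # bytes currently held in the buffer
        self.peak_buffered = 0    # largest number of bytes held at once
        self.sent = 0             # bytes handed to the pretend network

    def write(self, chunk: bytes) -> None:
        # Like the real write(): only append, nothing is sent here.
        self._write_buffer.append(chunk)
        self._buffered += len(chunk)
        self.peak_buffered = max(self.peak_buffered, self._buffered)

    def flush(self) -> None:
        # Hand everything queued so far to the pretend network.
        self.sent += self._buffered
        self._write_buffer.clear()
        self._buffered = 0

    def finish(self) -> None:
        self.flush()


def stream(handler, total_bytes, chunk_size, flush_each_chunk):
    """Write total_bytes in chunk_size pieces, optionally flushing each one."""
    remaining = total_bytes
    while remaining > 0:
        n = min(chunk_size, remaining)
        handler.write(b"x" * n)
        if flush_each_chunk:
            handler.flush()
        remaining -= n
    handler.finish()
    return handler


no_flush = stream(ToyHandler(), 1_000_000, 4096, flush_each_chunk=False)
with_flush = stream(ToyHandler(), 1_000_000, 4096, flush_each_chunk=True)
print(no_flush.peak_buffered, with_flush.peak_buffered)
```

In this model the run without per-chunk flushing holds the whole payload in the buffer before anything is sent, while the run that flushes each chunk never holds more than one chunk.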

The Tornado documentation for flush() is here:

http://www.tornadoweb.org/en/stable/web.html?highlight=flush#tornado.web.RequestHandler.flush
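Putting this together, a complete handler might look like the sketch below. It assumes Tornado 6 or later, where flush() returns an awaitable. The names `read_chunks`, `DownloadHandler`, `make_app`, the `/download` route, and the 64 KiB chunk size are all assumptions made for illustration. For simplicity it reads the file with a plain blocking `open()` instead of aiofiles, so each read briefly blocks the event loop.

```python
import tornado.web
from tornado.iostream import StreamClosedError

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for your workload


def read_chunks(path, chunk_size=CHUNK_SIZE):
    """Yield the file at `path` in pieces of at most chunk_size bytes."""
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                return
            yield data


class DownloadHandler(tornado.web.RequestHandler):
    def initialize(self, path):
        # `path` is supplied through the route's keyword-argument dict below.
        self.path = path

    async def get(self):
        self.set_header("Content-Type", "application/octet-stream")
        for chunk in read_chunks(self.path):
            self.write(chunk)
            try:
                # Wait until this chunk has been written before reading more,
                # so only about one chunk is buffered at a time.
                await self.flush()
            except StreamClosedError:
                return  # the client disconnected; stop reading the file


def make_app(path):
    """Build an application that serves `path` at the assumed /download route."""
    return tornado.web.Application(
        [(r"/download", DownloadHandler, {"path": path})]
    )
```

To try it, call `make_app("some_file.bin").listen(8888)` from inside a running asyncio event loop (the file name and port are placeholders) and request `/download`.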