Question

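I am currently working on the export feature of a web application that uses Pyramid on Python and runs on Ubuntu 14.04. It zips the files into a NamedTemporaryFile and sends it back through a FileResponse: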

import os
import zipfile
from tempfile import NamedTemporaryFile

from pyramid.response import FileResponse

# Excerpt from the view function; map_directory and filename are set earlier
# Create the temporary file to store the zip
with NamedTemporaryFile(delete=True) as output:
    map_zip = zipfile.ZipFile(output, 'w', zipfile.ZIP_DEFLATED)
    length_mapdir = len(map_directory)

    for root, dirs, files in os.walk(map_directory, followlinks=True):
        for file in files:
            file_path = os.path.join(root, file)
            # Store each file under its path relative to map_directory
            map_zip.write(file_path, file_path[length_mapdir:])

    map_zip.close()

    # Send the response as an attachment to let the user download the file
    response = FileResponse(os.path.abspath(output.name))
    response.headers['Content-Type'] = 'application/download'
    response.headers['Content-Disposition'] = 'attachment; filename="'+filename+'"'
    return response

On the client side, the export takes some time, then the file download popup appears. Nothing goes wrong, and everything is in the zip as planned.

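While the files are being zipped, I can see a file in /tmp/ growing larger and larger, and just before the download popup appears, the file disappears. I assume this is the NamedTemporaryFile.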

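While the file is being zipped or downloaded, the amount of RAM in use does not change significantly: it stays at around 40 MB even though the actual zip is over 800 MB.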

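Where is Pyramid downloading the file from? From what I understand of tempfile, the file is unlinked when it is closed. If that is true, is it possible that another process could write over the memory where the file is stored, corrupting whatever Pyramid is downloading?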

Answer 1

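In Unix environments, something called reference counting is used when a file is created and opened. For each open() call on a file the reference count is increased, and for each close() it is decreased. unlink() is special in that, when it is called, the file is unlinked from the directory tree but remains on disk for as long as the reference count stays above 0.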
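The behaviour described above can be checked with a few lines of Python. This is only a minimal sketch, assuming a POSIX system (on Windows, unlinking an open file normally fails); the function name `unlink_while_open` is made up for the illustration.

```python
import os
import tempfile


def unlink_while_open():
    """Unlink a file while a descriptor is still open on it (POSIX only)."""
    # mkstemp() creates the file and returns an already-open descriptor.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        links_before = os.fstat(fd).st_nlink  # directory entries pointing at the file
        os.unlink(path)                       # remove the name from the directory tree
        links_after = os.fstat(fd).st_nlink   # link count after the name is gone
        visible = os.path.exists(path)        # can the path still be found?
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)               # read back through the open descriptor
    finally:
        os.close(fd)                          # last open reference is released here
    return links_before, links_after, visible, data
```

On a typical Linux filesystem the data should still be readable through the descriptor even though the path no longer exists.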

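In your case, NamedTemporaryFile() creates a file on disk named /tmp/somefile

/tmp/somefile now has a link count of 1
/tmp/somefile then has open() called on it so that the file can be returned to you; this increases the reference count to 1
/tmp/somefile is then written to by your code, in this case as a zip file
/tmp/somefile is then passed to FileResponse(), which calls open() on it, increasing the reference count to 2
You exit the scope of the with statement, and NamedTemporaryFile() calls close() followed by unlink(). Your file now has 1 reference to it and a link count of 0. Because a reference still exists, the file still exists on disk, but it can no longer be seen when searching for it.
FileResponse() is iterated over by your WSGI server, and once the file has been fully read, your WSGI server calls close() on it, dropping the reference count to 0, at which point the file system cleans up the file entirely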
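The sequence above can be reproduced without Pyramid. The sketch below is not Pyramid's code: a plain open() stands in for what FileResponse() is described as doing, the function name `read_after_tempfile_closed` is hypothetical, and a POSIX system is assumed (reopening a NamedTemporaryFile by name may not work on Windows).

```python
import os
from tempfile import NamedTemporaryFile


def read_after_tempfile_closed(payload=b"zip bytes would go here"):
    """Open a second handle inside the with block, then read after it exits."""
    with NamedTemporaryFile(delete=True) as output:
        output.write(payload)
        output.flush()                    # push buffered bytes into the file
        name = output.name
        reader = open(name, "rb")         # second open(), standing in for FileResponse()
    # The with block has exited: the temporary file was closed and unlinked.
    try:
        gone = not os.path.exists(name)   # the name is no longer in /tmp
        data = reader.read()              # the content is still reachable via reader
    finally:
        reader.close()                    # dropping the last reference frees the file
    return gone, data
```

This also matches what the question observed: the file vanishes from /tmp/ before the download starts, yet the download still completes.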

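That last point is where the file is no longer accessible. In the meantime, your file is completely safe, and there is no way it can be overwritten, in memory or otherwise.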

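That being said, if FileResponse() is lazily loaded (i.e. it does not open() the file until the WSGI server starts sending the response), it is entirely possible that it would attempt to open() the temporary file too late, after NamedTemporaryFile() has already deleted it. Just something to keep in mind.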
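To make the caveat concrete, here is a small sketch of the failure mode and one possible workaround, again without Pyramid. Both function names are hypothetical, and the delete=False approach is only one option among several; it shifts responsibility for removing the file onto the caller.

```python
import os
from tempfile import NamedTemporaryFile


def open_too_late():
    """Try to open the temporary file only after the with block has removed it."""
    with NamedTemporaryFile(delete=True) as output:
        output.write(b"data")
        name = output.name
    try:
        open(name, "rb").close()          # a lazy consumer opening by path
    except FileNotFoundError:
        return "missing"
    return "opened"


def open_with_manual_cleanup():
    """Keep the file on disk with delete=False and remove it explicitly."""
    with NamedTemporaryFile(delete=False) as output:
        output.write(b"data")
        name = output.name
    try:
        with open(name, "rb") as reader:  # a late open() now succeeds
            return reader.read()
    finally:
        os.unlink(name)                   # the caller must clean up the file
```

With delete=False the file stays in the temporary directory until something removes it, so a real application would need to make sure the cleanup runs even when the response is aborted.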

Is it safe to download a NamedTemporaryFile from a Pyramid FileResponse?
