Why does TextIOWrapper close the given BytesIO stream?

Asked: 2018-01-25 01:50:36

Tags: python python-3.x csv bytesio

If I run the following code in Python 3:

from io import BytesIO
import csv
from io import TextIOWrapper


def fill_into_stringio(input_io):
    writer = csv.DictWriter(TextIOWrapper(input_io, encoding='utf-8'),fieldnames=['ids'])
    for i in range(100):
        writer.writerow({'ids': str(i)})

with BytesIO() as input_i:
    fill_into_stringio(input_i)
    input_i.seek(0)

I get the error:

ValueError: I/O operation on closed file.

If I don't use TextIOWrapper, the I/O stream stays open. For example, if I change my function to

def fill_into_stringio(input_io):
    for i in range(100):
        input_io.write(b'erwfewfwef')

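I no longer get any error, so for some reason TextIOWrapper is closing the stream that I want to read from afterwards. Is it meant to behave this way, and is there a way to achieve what I want without writing the CSV writer myself?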

1 answer:

Answer 0 (score: 3)

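The csv module is the odd one out here: its writer never takes ownership of the file it is given. Most file-like objects that wrap another object do take ownership of it and close it when they themselves are closed (or otherwise cleaned up). TextIOWrapper is one of these, so when the temporary wrapper created inside your function is cleaned up, your BytesIO is closed along with it.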
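The ownership behaviour described above can be seen in isolation with a minimal sketch. The helper name closes_underlying is made up for this illustration, and it calls close() explicitly so the result does not depend on when the wrapper happens to be garbage-collected:

```python
from io import BytesIO, TextIOWrapper


def closes_underlying():
    """Return True if closing the text wrapper also closed the BytesIO."""
    raw = BytesIO()
    wrapper = TextIOWrapper(raw, encoding='utf-8')
    wrapper.write('ids\r\n')
    # close() flushes pending text and then closes the wrapped buffer too.
    wrapper.close()
    return raw.closed


result = closes_underlying()
```

In the question's code nothing calls close() explicitly; the wrapper is simply dropped when the function returns, and its finalizer closes the stream in the same way.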

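One way to avoid the problem is to explicitly detach the underlying stream from the TextIOWrapper before allowing the wrapper to be cleaned up: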

def fill_into_stringio(input_io):
    # write_through=True prevents TextIOWrapper from buffering internally;
    # you could replace it with explicit flushes, but you want something 
    # to ensure nothing is left in the TextIOWrapper when you detach
    text_input = TextIOWrapper(input_io, encoding='utf-8', write_through=True)
    try:
        writer = csv.DictWriter(text_input, fieldnames=['ids'])
        for i in range(100):
            writer.writerow({'ids': str(i)})
    finally:
        text_input.detach()  # Detaches input_io so it won't be closed when text_input is cleaned up

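The only other built-in way to avoid this applies to real file objects: you can pass them a file descriptor together with closefd=False, and they won't close the underlying file descriptor when they are closed or otherwise cleaned up.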
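As a rough sketch of the closefd=False route, assuming a temporary file created with tempfile.mkstemp stands in for the real file (the helper name write_without_closing_fd is invented for this example):

```python
import os
import tempfile


def write_without_closing_fd(fd, text):
    # closefd=False: leaving the with-block flushes and closes the file
    # object, but the descriptor fd itself stays open for the caller.
    with open(fd, 'w', encoding='utf-8', newline='', closefd=False) as f:
        f.write(text)


fd, path = tempfile.mkstemp()
try:
    write_without_closing_fd(fd, 'ids\r\n0\r\n')
    # The descriptor is still usable: rewind and read back what was written.
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)
finally:
    os.close(fd)  # the caller remains responsible for closing fd
    os.remove(path)
```

This only helps when there is an actual file descriptor to hand over, so it does not apply to an in-memory BytesIO.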

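Of course, in your particular case there is a simpler approach: make your function expect a text-based file-like object and use it without rewrapping. Your function really shouldn't be responsible for imposing an encoding on the caller's output file (what if the caller wanted UTF-16 output?).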

Then you can do:

import csv
from io import StringIO

def fill_into_stringio(input_io):
    writer = csv.DictWriter(input_io, fieldnames=['ids'])
    for i in range(100):
        writer.writerow({'ids': str(i)})

# newline='' is the Python 3 way to prevent line-ending translation
# while continuing to operate as text, and it's recommended for any file
# used with the csv module
with StringIO(newline='') as input_i:
    fill_into_stringio(input_i)
    input_i.seek(0)
    # If you really need UTF-8 bytes as output, you can make a BytesIO at this point with:
    # BytesIO(input_i.getvalue().encode('utf-8'))
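
If the caller does need a byte stream in some other encoding, the two ideas in this answer can be combined: the caller creates the TextIOWrapper, picks the encoding, and detaches it afterwards. This is only a sketch under that assumption; it uses 3 rows instead of 100 to keep the output short, and the parameter name output is chosen for this example:

```python
import csv
from io import BytesIO, TextIOWrapper


def fill_into_stringio(output):
    # Expects a text-mode file-like object; no encoding decisions here.
    writer = csv.DictWriter(output, fieldnames=['ids'])
    for i in range(3):
        writer.writerow({'ids': str(i)})


with BytesIO() as raw:
    # The caller chooses the encoding; newline='' as recommended for csv.
    text = TextIOWrapper(raw, encoding='utf-16', newline='')
    try:
        fill_into_stringio(text)
        text.flush()  # make sure nothing is left buffered before detaching
    finally:
        text.detach()  # raw stays open after text is cleaned up
    payload = raw.getvalue()

decoded = payload.decode('utf-16')
```

After detach() the wrapper itself is unusable, but raw can still be read, rewound, or passed on.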