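I am trying to write a Python 2/3 compatible routine that fetches a CSV file, decodes it from latin_1 into Unicode, and feeds it to csv.DictReader in a robust, scalable way.
- For Python 2/3 support I am using python-future, including importing open from builtins and importing unicode_literals for consistent behaviour.
- I am using tempfile.SpooledTemporaryFile so that exceptionally large files can spill over to disk.
- I am using io.TextIOWrapper to handle decoding from the latin_1 encoding before the data is passed to DictReader.
This all works fine under Python 3.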
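The problem is that TextIOWrapper expects to wrap a stream that conforms to BufferedIOBase. Unfortunately, under Python 2, although I have imported the Python 3 style open, the vanilla Python 2 tempfile.SpooledTemporaryFile of course still returns a Python 2 cStringIO.StringO rather than the Python 3 io.BytesIO that TextIOWrapper requires.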
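I can think of these possible approaches:
1. Wrap the Python 2 cStringIO.StringO as a Python 3 style io.BytesIO. I am not sure how to approach this: would I need to write such a wrapper, or does one already exist?
2. Find a Python 2 alternative for decoding a cStringIO.StringO stream. I have not found one yet.
3. Drop SpooledTemporaryFile and decode entirely in memory. How large would the CSV file need to be before operating entirely in memory becomes a problem?
4. Drop SpooledTemporaryFile and implement my own spill-to-disk. That would let me call open from python-future, but I would rather not, because it would be very tedious and probably less secure.
What is the best way forward? Have I missed anything?
Imports: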
from __future__ import (absolute_import, division,
                        print_function, unicode_literals)
from builtins import (ascii, bytes, chr, dict, filter, hex, input,  # noqa
                      int, map, next, oct, open, pow, range, round,  # noqa
                      str, super, zip)  # noqa
import csv
import tempfile
from io import TextIOWrapper
import requests
Initialization:
...
self._session = requests.Session()
...
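Approach 3 from the question (dropping SpooledTemporaryFile and decoding entirely in memory) can be sketched as follows. This is a minimal illustration for Python 3 only, where the standard csv module accepts text. The helper name read_latin1_csv and the sample data are made up for the example and are not part of the original routine.

```python
import csv
import io


def read_latin1_csv(raw_bytes):
    """Decode latin_1 CSV bytes in memory and return the rows as dicts."""
    # io.BytesIO provides the buffered binary interface TextIOWrapper expects
    buffer = io.BytesIO(raw_bytes)
    # newline='' lets the csv module handle line endings itself
    with io.TextIOWrapper(buffer, encoding='latin_1', newline='') as text:
        return [dict(row) for row in csv.DictReader(text)]


# Hypothetical sample: two columns and one data row, encoded as latin_1
sample = u'name,city\r\nJos\xe9,M\xe1laga\r\n'.encode('latin_1')
rows = read_latin1_csv(sample)
```

The whole payload is held in memory, so this only suits files that comfortably fit there; it does not answer how large is too large.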
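Answer 0 (score: 3)
Not sure whether this will be useful. The situation is only loosely analogous to yours.
I wanted to use NamedTemporaryFile to create a CSV encoded in UTF-8 with OS-native line endings, which is possibly not quite standard, but is easy to accommodate with the Python 3 style io.open.
The difficulty is that NamedTemporaryFile in Python 2 opens a byte stream, which causes problems with line endings. The solution I settled on, which I think is nicer than separate cases for Python 2 and 3, is to create the temporary file, close it, and reopen it with io.open. The final piece is the excellent backports.csv library, which provides Python 3 style CSV handling in Python 2.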
from __future__ import absolute_import, division, print_function, unicode_literals
from builtins import str
import io
import os
import tempfile
from backports import csv  # Python 3 style csv; used instead of the stdlib csv module
data = [["1", "1", "John Coltrane", 1926],
        ["2", "1", "Miles Davis", 1926],
        ["3", "1", "Bill Evans", 1929],
        ["4", "1", "Paul Chambers", 1935],
        ["5", "1", "Scott LaFaro", 1936],
        ["6", "1", "Sonny Rollins", 1930],
        ["7", "1", "Kenny Burrel", 1931]]
## create CSV file
# Create the temporary file only to reserve a name, then let the with block close it
with tempfile.NamedTemporaryFile(delete=False) as temp:
    filename = temp.name
# Reopen it by name with io.open to get a text stream with explicit encoding and newline handling
with io.open(filename, mode='w', encoding="utf-8", newline='') as temp:
    writer = csv.writer(temp, quoting=csv.QUOTE_NONNUMERIC, lineterminator=str(os.linesep))
    headers = ['X', 'Y', 'Name', 'Born']
    writer.writerow(headers)
    for row in data:
        print(row)
        writer.writerow(row)
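As a possible follow-up to the snippet above, here is a minimal sketch that also reads the file back and removes it afterwards, since delete=False leaves cleanup to the caller. It assumes that falling back to the standard csv module is acceptable on Python 3 when backports.csv is not installed. The two sample rows are illustrative only.

```python
import io
import os
import tempfile

try:
    from backports import csv  # Python 3 style csv handling on Python 2
except ImportError:
    import csv  # assumption: on Python 3 the stdlib module is sufficient

rows = [['X', 'Name'], ['1', 'John Coltrane']]

# Reserve a file name, then close the byte-oriented handle straight away
with tempfile.NamedTemporaryFile(delete=False) as temp:
    filename = temp.name

try:
    # Reopen by name as a text stream with explicit encoding
    with io.open(filename, mode='w', encoding='utf-8', newline='') as out:
        csv.writer(out).writerows(rows)
    # Read the rows back to confirm the round trip
    with io.open(filename, mode='r', encoding='utf-8', newline='') as src:
        read_back = list(csv.reader(src))
finally:
    # delete=False means nothing else will remove the file
    os.remove(filename)
```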
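Answer 1 (score: 1)
The approach in the previous answer works as follows:
1. tempfile.NamedTemporaryFile() creates a temporary file, and we remember its name.
2. On leaving the with statement, the file is closed.
3. We then reopen it by name with io.open().
I am not sure whether, on some platforms (such as nt), another user could delete the file while it is not open and then create it again, gaining access to its contents. If that is not possible, someone please correct me.
Here is what I would suggest instead: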
# Create temporary file
with tempfile.NamedTemporaryFile() as tf_oldstyle:
    # get its file descriptor - note that it will also work with tempfile.TemporaryFile
    # which has no meaningful name at all
    fd = tf_oldstyle.fileno()
    # open that fd with io.open, using desired mode (could use binary mode or whatever)
    # closefd=False leaves the descriptor to tf_oldstyle, which closes it when we leave
    # the outer with block; otherwise both objects would try to close the same fd
    with io.open(fd, 'w+', encoding='utf-8', newline='', closefd=False) as tf:
        # now work with the tf (arguments elided in the original)
        writer = csv.writer(tf, ...)
        writer.writerow(...)
# At this point, fd is closed, and the file is deleted.
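For a concrete, runnable illustration of reopening a temporary file's descriptor with io.open, here is a minimal sketch for Python 3 (on Python 2 the standard csv module would need to be swapped for backports.csv). Passing closefd=False is a choice made for this example so that only the tempfile object closes the descriptor. The sample rows are made up.

```python
import csv
import io
import tempfile

with tempfile.TemporaryFile() as raw:
    # Wrap the existing descriptor in a text stream; closefd=False means the
    # wrapper flushes on exit but leaves closing the descriptor to `raw`
    with io.open(raw.fileno(), 'w+', encoding='utf-8', newline='',
                 closefd=False) as tf:
        writer = csv.writer(tf)
        writer.writerow(['name', 'born'])
        writer.writerow(['Bill Evans', 1929])
        tf.seek(0)  # flushes pending writes, then rewinds for reading
        rows = list(csv.reader(tf))
    # The text wrapper is closed, but the underlying file is still usable
    still_open = not raw.closed
    raw.seek(0)
    raw_bytes = raw.read()
# Leaving the outer block closes the descriptor and deletes the file
```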
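Alternatively, we could use tempfile.mkstemp() directly, which creates a file and returns its fd and name as a tuple, although using *TemporaryFile is probably more secure and more portable across platforms.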
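fd, name = tempfile.mkstemp()
try:
    # closefd=False: the wrapper flushes on exit but leaves fd for os.close below
    with io.open(fd, 'w+', encoding='utf-8', newline='', closefd=False) as tf:
        writer = csv.writer(tf, ...)
        writer.writerow(...)
finally:
    os.close(fd)
    os.unlink(name)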
I would try subclassing SpooledTemporaryFile under Python 2 and overriding its rollover method.
Warning: untested.
import io
import sys
import tempfile
if sys.version_info >= (3,):
    SpooledTemporaryFile = tempfile.SpooledTemporaryFile
else:
    class SpooledTemporaryFile(tempfile.SpooledTemporaryFile):
        def __init__(self, max_size=0, mode='w+b', **kwargs):
            # Call the base class explicitly; this also works if the Python 2
            # base class is an old-style class, where super() would fail
            tempfile.SpooledTemporaryFile.__init__(self, max_size, mode, **kwargs)
            # replace cStringIO with io.BytesIO or io.StringIO
            if 'b' in mode:
                self._file = io.BytesIO()
            else:
                self._file = io.StringIO(newline='\n')  # see python3's tempfile sources for reason
        def rollover(self):
            if self._rolled:
                return
            # call the base implementation and then replace underlying file object
            tempfile.SpooledTemporaryFile.rollover(self)
            fd = self._file.fileno()
            name = self._file.name
            mode = self._file.mode
            delete = self._file.delete
            pos = self._file.tell()
            # self._file is assumed to be a tempfile._TemporaryFileWrapper here
            # (this may not hold on every platform).
            # It caches methods so we cannot just replace its .file attribute,
            # so let's create another _TemporaryFileWrapper
            file = io.open(fd, mode)
            file.seek(pos)
            self._file = tempfile._TemporaryFileWrapper(file, name, delete)
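If subclassing turns out to be fragile, a different route in the spirit of approach 2 from the question is to decode the byte stream with codecs.getreader instead of TextIOWrapper, since a codecs stream reader only needs an object with a read() method. This is not what the answer above proposes; it is a minimal sketch, checked against Python 3 semantics only. On Python 2 the standard csv module expects byte strings, so it would presumably need to be paired with backports.csv (an untested assumption). The sample data and the small max_size are illustrative.

```python
import codecs
import csv
import tempfile

# Hypothetical latin_1 payload standing in for the downloaded CSV
raw = u'name,city\r\nJos\xe9,M\xe1laga\r\n'.encode('latin_1')

# max_size is deliberately tiny so the sample spills over to disk
with tempfile.SpooledTemporaryFile(max_size=16, mode='w+b') as spool:
    spool.write(raw)
    spool.seek(0)
    # The stream reader decodes bytes lazily as the csv reader asks for lines
    text = codecs.getreader('latin_1')(spool)
    rows = [dict(row) for row in csv.DictReader(text)]
```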