Encoding tuples for export (memory)

Time: 2016-02-19 17:31:05

Tags: python csv

I have a list of tuples containing different values. It looks like this:

data = [
('Column1', 'Column2'),
('myFirstNovel', 'myAge'),
('mySecondNovel', 'myAge2'),
('myThirdNovel', 'myAge3'),
('myFourthNovel', 'myAge4')
]

I am writing this data to a CSV file and want to encode it before exporting, but I run into encoding errors. So I tried this:

[[all.encode('utf-8') for all in items] for items in data]

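This did not really solve my problem (the data is full of \xe2\x80\x94, \xc2\xa0 and similar byte sequences). The main issue, though, is that it takes a very long time and Python nearly crashes.

Is there a better way, or should I change my export method?

(Currently using the csv module and its writer)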

1 answer:

Answer 0: (score: 0)

If you are using Python 2.x, you can use the UnicodeWriter class given in the examples section of the Python 2 csv module documentation:

import codecs
import csv
import cStringIO

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)

In Python 3.x, you can simply pass the encoding to the open function.
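As a minimal sketch of the Python 3 approach, assuming the rows are already str (not bytes): the helper name write_rows and the file name export.csv are made up for illustration, and the em dash and non-breaking space in the sample data are there only to mimic the characters mentioned in the question. The file is opened in text mode with an explicit encoding, so no per-cell encode() call is needed.

```python
import csv
import os
import tempfile

# Sample rows; the \u2014 (em dash) and \u00a0 (non-breaking space)
# are illustrative stand-ins for the characters in the question.
data = [
    ('Column1', 'Column2'),
    ('myFirstNovel', 'myAge'),
    ('mySecond\u2014Novel', 'my\u00a0Age2'),
]


def write_rows(path, rows, encoding="utf-8"):
    """Write rows to a CSV file; the text-mode file object does the encoding."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f).writerows(rows)


# Hypothetical output location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "export.csv")
write_rows(path, data)

# Read the file back to check that the text round-trips unchanged.
with open(path, encoding="utf-8", newline="") as f:
    read_back = [tuple(row) for row in csv.reader(f)]
```

Because writerows receives the whole list at once and the file object encodes as it writes, there is no need to build an encoded copy of the data in memory first.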