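I want a Unicode version of the following code

Asked: 2012-10-04 18:44:28

Tags: python string parsing unicode

I got the following code from an SO expert, but it works on ANSI strings, and my input is a Unicode string. How can I make this code work for both? TIA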

import csv
from collections import defaultdict
summary = defaultdict(list)
csvin = csv.reader(open('qwetry.txt'), delimiter='|')
for row in csvin:
    summary[(row[1].split()[0], row[2])].append(int(row[5]))
csvout = csv.writer(open('datacopy.out','wb'), delimiter='|')
for who, what in summary.iteritems():
    csvout.writerow( [' '.join(who), len(what), sum(what)] )

Courtesy: Jon Clements

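1 answer:

Answer 0 (score: 0)

The csv module (in Python 2) does not directly support reading and writing Unicode. The details are in the csv module documentation, which gives the following generators for reading: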

import csv

def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
    # csv.py doesn't do Unicode; encode temporarily as UTF-8:
    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),
                            dialect=dialect, **kwargs)
    for row in csv_reader:
        # decode UTF-8 back to Unicode, cell by cell:
        yield [unicode(cell, 'utf-8') for cell in row]

def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')
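
On Python 3 the csv module reads and writes str (Unicode) directly, so the encode/decode workaround above should not be needed there. The following is a minimal sketch of the question's summary under that assumption. The function name summarize, the sample rows, and the choice of UTF-8 are illustrative assumptions, not part of the original post.

```python
import csv
import io
from collections import defaultdict


def summarize(src, dst):
    """Group rows by (first word of column 1, column 2), then write
    the count and the sum of column 5 for each group."""
    summary = defaultdict(list)
    for row in csv.reader(src, delimiter='|'):
        summary[(row[1].split()[0], row[2])].append(int(row[5]))
    writer = csv.writer(dst, delimiter='|')
    for who, what in summary.items():
        writer.writerow([' '.join(who), len(what), sum(what)])


# In-memory demonstration with made-up rows (hypothetical data,
# deliberately containing non-ASCII characters).
sample = io.StringIO(
    "1|Zoë Smith|café|x|y|3\n"
    "2|Zoë Jones|café|x|y|4\n"
    "3|Łukasz K|thé|x|y|5\n"
)
out = io.StringIO()
summarize(sample, out)
result = out.getvalue()
print(result)
```

With real files, the same function could be called as summarize(src, dst) after opening the input with open('qwetry.txt', encoding='utf-8', newline='') and the output with open('datacopy.out', 'w', encoding='utf-8', newline=''). The encoding is an assumption here; if the input is actually UTF-16 or another encoding, pass that name instead.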