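In short, my problem is that my script cannot write complete Unicode strings (retrieved from a DB) to a CSV file; instead, only the first character of each string is written. For example: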
U,1423.0,831,1,139
The output should be:
University of Washington Students,1423.0,831,1,139
Some background: I am using pyodbc to connect to an MSSQL database. I have my ODBC configuration file set up for Unicode, and I connect to the db as follows:
p.connect("DSN=myserver;UID=username;PWD=password;DATABASE=mydb;CHARSET=utf-8")
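I can retrieve the data without any problem, but the trouble starts when I try to save the query results to a CSV file. I have tried csv.writer, the UnicodeWriter recipe from the official docs, and most recently the unicodecsv module I found on GitHub. Each method produces the same result.
The strange thing is that I can print the strings in the Python console with no problem. Yet if I take the same string and write it to CSV, the problem appears. See my test code and results below:
Code that highlights the problem: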
print "'Raw' string from database:"
print "\tencoding:\t" + whatisthis(report.data[1][0])
print "\tprint string:\t" + report.data[1][0]
print "\tstring len:\t" + str(len(report.data[1][0]))
f = StringIO()
w = unicodecsv.writer(f, encoding='utf-8')
w.writerows(report.data)
f.seek(0)
r = unicodecsv.reader(f)
row = r.next()
row = r.next()
print "Write/Read from csv file:"
print "\tencoding:\t" + whatisthis(row[0])
print "\tprint string:\t" + row[0]
print "\tstring len:\t" + str(len(row[0]))
Test results:
'Raw' string from database:
encoding: unicode string
print string: University of Washington Students
string len: 66
Write/Read from csv file:
encoding: unicode string
print string: U
string len: 1
What could be causing this problem, and how can I fix it? Thanks!
Edit: the whatisthis function just checks the string type; it is taken from this post:
def whatisthis(s):
    if isinstance(s, str):
        print "ordinary string"
    elif isinstance(s, unicode):
        print "unicode string"
    else:
        print "not a string"
Answer 0 (score: 1)
import StringIO as sio
import unicodecsv as ucsv
class Report(object):
    def __init__(self, data):
        self.data = data

report = Report(
    [
        ["University of Washington Students", 1, 2, 3],
        ["UCLA", 5, 6, 7]
    ]
)
print report.data
print report.data[0][0]
print "*" * 20
f = sio.StringIO()
writer = ucsv.writer(f, encoding='utf-8')
writer.writerows(report.data)
print f.getvalue()
print "-" * 20
f.seek(0)
reader = ucsv.reader(f)
row = reader.next()
print row
print row[0]
--output:--
[['University of Washington Students', 1, 2, 3], ['UCLA', 5, 6, 7]]
University of Washington Students
********************
University of Washington Students,1,2,3
UCLA,5,6,7
--------------------
[u'University of Washington Students', u'1', u'2', u'3']
University of Washington Students
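Since the round trip works with ordinary data, the difference is presumably in the values coming back from the database. One detail in the question's test results stands out: the reported length is 66, while "University of Washington Students" has only 33 characters. A possible explanation (an assumption, not something the question confirms) is that each visible character is followed by an invisible NUL character, which print would not show. The sketch below is a diagnostic only; describe and strip_nuls are hypothetical helper names, and the suspect value is constructed by hand to mimic the reported length rather than taken from a real database.

```python
from __future__ import print_function


def describe(value):
    """Return (type name, length, repr) so hidden characters become visible."""
    return type(value).__name__, len(value), repr(value)


def strip_nuls(text):
    """Drop embedded NUL characters; a workaround, not a fix for the root cause."""
    return text.replace(u"\x00", u"")


# Hypothetical value: every visible character is followed by a NUL.
# This is an assumption made to mimic the reported length of 66.
expected = u"University of Washington Students"
suspect = u"".join(ch + u"\x00" for ch in expected)

print(describe(expected)[1])            # length of the clean string
print(describe(suspect)[1])             # length of the padded string
print(strip_nuls(suspect) == expected)  # cleaning restores the clean string
```

Running describe(report.data[1][0]) on the real data and looking at the repr would show whether this guess applies. If it does, the connection's encoding settings are probably the place to fix it, and strip_nuls is only a stopgap.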
Who knows what shenanigans your whatisthis() function is up to.
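For what it is worth, the whatisthis() posted in the question prints its result and returns None, so as far as I can tell the "\tencoding:\t" + whatisthis(...) lines would raise a TypeError rather than produce the output shown. A minimal sketch of a variant that returns the label instead is below. The text_type alias is my own addition so the snippet runs on both Python 2 and Python 3; it is not part of the original function.

```python
# Assumption: text_type is a compatibility alias introduced for this sketch.
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3


def whatisthis(s):
    """Return a label for the string type instead of printing it."""
    if isinstance(s, text_type):
        return "unicode string"
    elif isinstance(s, bytes):  # bytes is an alias of str on Python 2
        return "ordinary string"
    return "not a string"


# The return value can now be concatenated, as the test code expects.
print("\tencoding:\t" + whatisthis(u"University of Washington Students"))
```

With this version the label shows up on the same line as the "encoding:" prefix, which makes the test output easier to trust.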