Inserting Unicode into SQLite?

Asked: 2012-02-11 05:44:15

Tags: python unicode sqlite

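I'm still learning Python, and as a small project I wrote a script that inserts values from a text file into an sqlite3 database. Some of the names contain odd letters (I guess you'd call them non-ASCII), and the script raises an error whenever one of them comes up. Here is my little script (and please tell me if there is any way it could be more Pythonic):

import sqlite3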

f = open('complete', 'r')
fList = f.readlines()
conn = sqlite3.connect('tpb')
cur = conn.cursor()

for i in fList:
    exploaded = i.split('|')
    eList = (
        (exploaded[1], exploaded[5])
    )
    cur.execute('INSERT INTO magnets VALUES(?, ?)', eList)
    conn.commit()
cur.close()

It produces this error:

Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\sortinghat.py", line 13, in <module>
    cur.execute('INSERT INTO magnets VALUES(?, ?)', eList)
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

1 answer:

Answer 0 (score: 4)

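To convert the file's contents to Unicode, you need to decode them using whatever encoding the file is actually in. It looks like you're on Windows, so a good bet is cp1252. If you got the file from somewhere else, all bets are off.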
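
When the encoding is not known for certain, one option is to try a short list of candidates in order. The following is only a sketch: the helper name `decode_with_fallback` and the candidate list are assumptions for illustration, not something from the question. UTF-8 goes first because it rejects most byte sequences that are not really UTF-8, whereas cp1252 accepts almost anything and would silently produce mojibake.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Return (text, encoding_name) for the first candidate that decodes cleanly.

    Hypothetical helper: the candidate list is a guess, adjust it to your data.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes, try the next
    raise ValueError("none of the candidate encodings could decode the data")


# The same name encoded two different ways, as it might arrive from two sources.
sample_cp1252 = u"Bj\u00f6rk".encode("cp1252")
sample_utf8 = u"Bj\u00f6rk".encode("utf-8")

text_a, used_a = decode_with_fallback(sample_cp1252)
text_b, used_b = decode_with_fallback(sample_utf8)
```

Both calls give back the same Unicode string, and the second element of each result tells you which encoding was used. A successful decode is not proof that the guess was right, so it is still worth looking at a few decoded names by eye.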

Once you've sorted out the encoding, an easy way to decode is to use the codecs module, e.g.:

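import codecs
# ...
# Decode while reading, so every line is a unicode string rather than a byte string.
with codecs.open('complete', encoding='cp1252') as fin:  # or utf-8 or whatever
    for line in fin:
        fields = line.split('|')  # split once and reuse the result
        to_insert = (fields[1], fields[5])
        cur.execute('INSERT INTO magnets VALUES (?,?)', to_insert)
        conn.commit()
# ...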
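
For reference, here is a self-contained sketch of the same idea that can be run end to end. Several things in it are assumptions rather than details from the question: the two-column schema of `magnets` (the column names `name` and `link` are made up), the function name `load_magnets`, the sample line, and the choice to strip the trailing newline and skip lines with fewer than six fields. It uses `io.open`, which takes an `encoding` argument just like `codecs.open`, and commits once at the end instead of after every row.

```python
import io
import os
import sqlite3
import tempfile


def load_magnets(path, conn, encoding="cp1252"):
    """Insert fields 1 and 5 of each pipe-delimited line into the magnets table."""
    rows = []
    with io.open(path, encoding=encoding) as fin:  # lines arrive already decoded
        for line in fin:
            fields = line.rstrip("\r\n").split("|")
            if len(fields) > 5:  # assumption: silently skip malformed lines
                rows.append((fields[1], fields[5]))
    with conn:  # commits once on success, rolls back if an insert fails
        conn.executemany("INSERT INTO magnets VALUES (?, ?)", rows)
    return len(rows)


# Demo with a throwaway cp1252 file and an in-memory database.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(u"1|Bj\u00f6rk|a|b|c|magnet:?xt=demo\n".encode("cp1252"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE magnets (name TEXT, link TEXT)")  # hypothetical schema
inserted = load_magnets(demo_path, conn)
stored = conn.execute("SELECT name, link FROM magnets").fetchall()
os.remove(demo_path)
```

Collecting the rows first and using `executemany` inside `with conn:` keeps the whole import in a single transaction, which is usually much faster than committing per row and leaves the table unchanged if something goes wrong halfway through.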