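Saving the micro sign character in a mongo collection

Time: 2012-08-17 23:26:55

Tags: python unicode python-unicode

I am using a Python script to create a mongo collection based on a MySQL database. The problem is with the micro sign (µ):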

bson.errors.InvalidStringData: strings in documents must be valid UTF-8: '\xb5g'

I tried encoding/decoding the value with different codecs (utf-8, latin-1, cp1252, iso-8859-2) without success; I always get the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 0: ordinal not in range(128)

This is the code that fetches the data from the MySQL db. The database is the USDA database.

    # -*- encoding: utf-8 -*-

    import MySQLdb
    mysqldb = MySQLdb.connect(DBCONF)
    cursor = mysqldb.cursor()
    foodid = 1001
    q = (
        ' SELECT nut.Nutr_Val,'
        ' nutdef.Units,'
        ' nutdef.NutrDesc, nutdef.Tagname'
        ' FROM food_des AS f'
        ' JOIN nutrient AS nut ON nut.NDB_No = f.NDB_No'
        ' JOIN nutrient_def AS nutdef ON nutdef.Nutr_No = nut.Nutr_No'
        ' WHERE f.NDB_No = %s'
    )
    # pass the id as a query parameter so the driver quotes it
    cursor.execute(q, (foodid,))

The field containing the micro sign character is nutdef.Units.

1 answer:

Answer 0 (score: 1)

Try decoding the characters as latin-1:

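    # Python 2: a byte string like the one returned for nutdef.Units
    a = '\xb5g'
    # '\xb5g'
    print a
    # ?g

    # decode the latin-1 bytes into a unicode object
    b = a.decode('latin-1')
    print b
    # µg

    b
    # u'\xb5g'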
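To see why the other attempts failed and what BSON expects, here is a minimal sketch. It assumes the column bytes really are latin-1 (where 0xb5 is the micro sign) and uses a bytes literal so that it runs on both Python 2.6+ and Python 3. The names `raw`, `can_decode`, `text` and `utf8_bytes` are only illustrative.

```python
# Raw bytes of the Units value, assumed to be latin-1 encoded
raw = b'\xb5g'

def can_decode(data, codec):
    """Return True if the byte string `data` is valid in `codec`."""
    try:
        data.decode(codec)
        return True
    except UnicodeDecodeError:
        return False

# 0xb5 is outside ASCII, and a lone 0xb5 is not a valid UTF-8 sequence
ascii_ok = can_decode(raw, 'ascii')
utf8_ok = can_decode(raw, 'utf-8')
latin1_ok = can_decode(raw, 'latin-1')

# Decode to text first; the UTF-8 form is what a BSON document stores
text = raw.decode('latin-1')
utf8_bytes = text.encode('utf-8')
```

In Python 2, calling `.encode()` on a byte string first decodes it implicitly with the ASCII codec, which is likely where the `'ascii' codec can't decode byte 0xb5` error comes from. Decoding first and passing the resulting unicode object to the mongo driver should avoid it.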

Alternatively, you can solve this by telling MySQLdb to return unicode for all CHAR, VARCHAR and TEXT fields:

    MySQLdb.connect(..., use_unicode=True)
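If changing the connection is not an option, the same decoding could be applied to each fetched row before inserting it into mongo. The following is only a sketch: `decode_row` is a hypothetical helper, not part of MySQLdb, the latin-1 default is an assumption about how the table is encoded, and the sample row holds made-up placeholder values.

```python
def decode_row(row, encoding='latin-1'):
    """Return a copy of `row` with every byte string decoded to text.

    Values that are not byte strings (numbers, None) pass through unchanged.
    """
    return tuple(
        value.decode(encoding) if isinstance(value, bytes) else value
        for value in row
    )

# Placeholder row shaped like the query result:
# (Nutr_Val, Units, NutrDesc, Tagname)
row = (1.5, b'\xb5g', b'example nutrient', b'EXAMPLE')
decoded = decode_row(row)
```

Each row from `cursor.fetchall()` could be passed through `decode_row` in the same way, so that only unicode values reach the mongo document.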