Python: error inserting an OrderedDict with Unicode data

Time: 2017-03-29 10:22:33

Tags: python mysql mongodb unicode pymongo

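My script migrates data from MySQL to MongoDB. It runs perfectly well when no Unicode columns are included, but as soon as I add the OrgLanguages column it throws the following error: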

    mongoImp = dbo.insert_many(odbcArray)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/collection.py", line 711, in insert_many
    blk.execute(self.write_concern.document)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/bulk.py", line 493, in execute
    return self.execute_command(sock_info, generator, write_concern)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/bulk.py", line 319, in execute_command
    run.ops, True, self.collection.codec_options, bwc)
bson.errors.InvalidStringData: strings in documents must be valid UTF-8: 'Portugu\xeas do Brasil, ?????, English, Deutsch, Espa\xf1ol latinoamericano, Polish'

My code:

import MySQLdb, MySQLdb.cursors, sys, pymongo, collections

odbcArray=[]
mongoConStr = '192.168.10.107:36006'
sqlConnect = MySQLdb.connect(host = "54.175.170.187", user = "testuser", passwd = "testuser", db = "testdb", cursorclass=MySQLdb.cursors.DictCursor)
mongoConnect = pymongo.MongoClient(mongoConStr)

sqlCur = sqlConnect.cursor()
sqlCur.execute("SELECT ID,OrgID,OrgLanguages,APILoginID,TransactionKey,SMTPSpeed,TimeZoneName,IsVideoWatched FROM organizations")

dbo = mongoConnect.eaedw.mysqlData
tuples = sqlCur.fetchall()

for tuple in tuples:
    odbcArray.append(collections.OrderedDict(tuple))

mongoImp = dbo.insert_many(odbcArray)

sqlCur.close()
mongoConnect.close()
sqlConnect.close()
sys.exit()

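The script above migrates the data completely when the OrgLanguages column is left out of the SELECT query. To work around the problem, I tried using OrderedDict() in a different way, but that gave me a different kind of error. Changed code: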

for tuple in tuples:
    doc = collections.OrderedDict()
    doc['oid'] = tuple.OrgID
    doc['APILoginID'] = tuple.APILoginID
    doc['lang'] = unicode(tuple.OrgLanguages)
    odbcArray.append(doc)
mongoImp = dbo.insert_many(odbcArray)

Error received:

Traceback (most recent call last):
  File "pymsql.py", line 19, in <module>
    doc['oid'] = tuple.OrgID
AttributeError: 'dict' object has no attribute 'OrgID'

1 Answer:

Answer 0 (score: 0)

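Your MySQL connection is returning strings in an encoding other than UTF-8, which is the encoding every BSON string must use. Try your original code, but pass charset='utf8' to MySQLdb.connect.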
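To show why the insert is rejected, and how rows could be repaired on the client side if the connection setting cannot be changed, here is a minimal sketch that runs without any database. It assumes the bytes coming back from MySQL are Latin-1 encoded; the `\xea` and `\xf1` in the traceback are consistent with that, but it is only an assumption. `to_text`, `row_to_document` and `sample_row` are hypothetical names made up for illustration, not part of MySQLdb or pymongo. The sketch also reads columns as `row['OrgID']` rather than `row.OrgID`, because a DictCursor returns plain dicts, which is what the AttributeError in the question points at.

```python
# -*- coding: utf-8 -*-
import collections


def to_text(value, encoding="latin-1"):
    """Decode byte strings to text; leave every other value untouched.

    The default encoding is an assumption about what the MySQL
    connection returned, not something the traceback proves.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def row_to_document(row):
    """Build an ordered document from a DictCursor-style row (a plain dict)."""
    doc = collections.OrderedDict()
    doc["oid"] = row["OrgID"]  # key access, not row.OrgID
    doc["APILoginID"] = to_text(row["APILoginID"])
    doc["lang"] = to_text(row["OrgLanguages"])
    return doc


# A made-up row shaped like the data in the question.
sample_row = {
    "OrgID": 1,
    "APILoginID": b"abc123",
    "OrgLanguages": b"Portugu\xeas do Brasil, Espa\xf1ol latinoamericano",
}

# The raw bytes are not valid UTF-8, which is what BSON complains about.
try:
    sample_row["OrgLanguages"].decode("utf-8")
    raw_is_valid_utf8 = True
except UnicodeDecodeError:
    raw_is_valid_utf8 = False

# After decoding, every string in the document is proper text.
document = row_to_document(sample_row)
print(raw_is_valid_utf8)
```

With charset='utf8' set on the connection as suggested, this client-side decoding should not be necessary; the documents built this way would be passed to insert_many exactly as in the question.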