Python: can't find the unicode field that causes bson.errors.InvalidDocument during a mongo insert

Time: 2015-03-31 07:48:34

Tags: python mongodb character-encoding pymongo script-debugging

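I am using pymongo to insert a complex structure as a row into a collection. The structure is a dictionary of lists of dictionaries, and so on.

Is there a way to find out which field is unicode rather than str and is therefore causing the error? I tried: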

def dump(obj):
  # Write each attribute name, its value, and the value's type to the log file
  with open('log', 'w') as flog:
    for attr in dir(obj):
      value = getattr(obj, attr)
      output = "obj.%s = %r (%s)\n" % (attr, value, type(value))
      flog.write(output)

But no luck so far.

Is there a clever recursive way to print this out?

Thanks

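1 answer:

Answer 0: (score: 0)

The following helped me find out which dict contained a unicode value, since a dict entry can be identified by its key. It is less helpful for lists, where it only reports that a unicode value was found somewhere in a list.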

def find_the_damn_unicode(obj):

    if isinstance(obj, unicode):
        # Rebinding obj here does not change the caller's variable, so the
        # caller has to use the return value. That is also why the dict and
        # list branches below encode in place inside their loops.
        obj = obj.encode('utf-8')
        return obj

    if isinstance(obj, dict):
        for k, v in obj.items():
            if isinstance(v, unicode):
                print 'UNICODE value with key ', k
                obj[k] = obj[k].encode('utf-8')
            else:
                obj[k] = find_the_damn_unicode(v)

    if isinstance(obj, list):
        for i, v in enumerate(obj):
            if isinstance(v, unicode):
                print 'UNICODE inside a ... list'
                obj[i] = obj[i].encode('utf-8')
            else:
                obj[i] = find_the_damn_unicode(v)

    return obj
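
The function above names the dict key but not the position inside a list. A minimal sketch of a variant that only reports locations, without modifying anything, is shown below. It assumes the document is built from plain dicts, lists and tuples, and it checks values only, not dict keys. The names find_paths, text_type and sample are hypothetical and are not part of pymongo or of the code above; the try/except is there only so the sketch also runs on Python 3, where unicode does not exist.

```python
try:
    text_type = unicode  # Python 2, as in the question
except NameError:
    text_type = str      # Python 3 fallback so the sketch still runs


def find_paths(obj, target_type=text_type, path="doc"):
    """Yield a path string for every value of target_type nested in obj."""
    if isinstance(obj, target_type):
        yield path
    elif isinstance(obj, dict):
        for key, value in obj.items():
            # Extend the path with the dict key, e.g. doc['users']
            for found in find_paths(value, target_type, "%s[%r]" % (path, key)):
                yield found
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            # Extend the path with the list index, e.g. doc['users'][0]
            for found in find_paths(value, target_type, "%s[%d]" % (path, index)):
                yield found


# Hypothetical sample document with text values at two nesting depths
sample = {"users": [{"name": u"Zo\xeb", "age": 30}, {"tags": [1, u"x"]}]}
for location in find_paths(sample):
    print(location)
```

Each printed path can be pasted back into an interpreter to inspect the offending value. Passing a different target_type (for example datetime.date) would look for other types that might be rejected on insert.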