If I try to update a value containing a diacritic character (ñ) in MongoDB using pymongo, it throws:
strings in documents must be valid UTF-8: "ccopy_reg\n_reconstructor\np0\n(ctextblob.cla...
I tried encoding it manually:
enc='UTF-8'
content = request.get_data() # raw encoded content
u_content = content.decode(enc) # decodes from enc to unicode
utf8_content = u_content.encode("UTF-8")
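If I use some other encoding instead of enc='UTF-8', it works, but the diacritic characters come out wrong. If I don't decode and encode at all, I get the same exception.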
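To illustrate the behaviour described above, here is a minimal sketch, assuming the request body is UTF-8 encoded JSON. The `raw` bytes are a made-up stand-in for `request.get_data()`, and the sketch is written for Python 3 rather than the Python 2 used in the question. It suggests that decoding and re-encoding with the same codec leaves the bytes unchanged, while decoding with a different codec such as Latin-1 succeeds but yields the wrong characters.

```python
# Hypothetical request body: UTF-8 encoded JSON containing "ñ" (bytes C3 B1).
raw = b'{"text": "ma\xc3\xb1ana"}'

# Decoding and re-encoding with the same codec is a no-op on the bytes.
round_trip = raw.decode("utf-8").encode("utf-8")
print(round_trip == raw)

# Latin-1 maps every byte to a character, so decoding never fails,
# but the two UTF-8 bytes of "ñ" become two separate characters.
mojibake = raw.decode("latin-1")
print(ascii(mojibake))
```

This would explain why switching `enc` makes the error go away in some cases without actually giving correct text.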
Full code:
try:
    # Load params arriving as json data
    enc='UTF-8'
    content = request.get_data() # raw encoded content
    print repr(content)
    u_content = content.decode(enc) # decodes from enc to unicode
    utf8_content = u_content.encode('UTF-8')
    params = json.loads(utf8_content)
    # Check all parameters
    customer_id = params.get('customer', '')
    check_credentials(customer_id, params.get('apikey', ''))
    collection_id = params.get('collection', '')
    if not collection_id or not str(collection_id).isdigit():
        raise Exception, "Invalid collection"
    train_records = params.get('train', [])
    if not train_records:
        raise Exception, "Train records are needed in the 'train' parameter"
    # Store the trained classifier in database for a better performance
    train_records = map(lambda x: x.values(), train_records)
    cl = NaiveBayesClassifier(train_records)
    pk = '%s__%i' % (customer_id, collection_id)
    data = {'_id': pk, 'customer': customer_id, 'collection': collection_id, 'classifier': pickle.dumps(cl), 'train':train_records}
    if db.classifiers.find_one({'_id': pk}):
        db.classifiers.update({'_id': pk}, data)
    else:
        db.classifiers.insert(data)
    print 'ok'
    # Asynchronously increase usage count in order to check rate limits
    gevent.spawn(increase_usage, customer_id)
except Exception as e:
    print e
The exception is raised here: db.classifiers.update({'_id': pk}, data)
After this line, params = json.loads(utf8_content), the bytes \xc3\xb1 have been converted to u'\xf1'.
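The conversion noted above, and one possible reason the exception still appears, can be sketched as follows. This is a guess based on the error text, which begins with pickle output (`ccopy_reg\n_reconstructor`): the string being rejected may be the pickled classifier rather than the request content. The sketch assumes Python 3 and passes `protocol=0` explicitly to mimic the Python 2 default; the `raw` payload is invented for illustration and a plain dict stands in for the classifier.

```python
import json
import pickle

# Hypothetical UTF-8 request body, standing in for request.get_data().
raw = b'{"text": "ma\xc3\xb1ana"}'
params = json.loads(raw.decode("utf-8"))
# The two UTF-8 bytes C3 B1 are now the single code point U+00F1.
print(ascii(params["text"]))

# Protocol 0 writes text with the raw-unicode-escape codec, so U+00F1
# ends up in the pickle as the single byte F1.
pickled = pickle.dumps(params, protocol=0)
print(b"\xf1" in pickled)

# A lone F1 byte followed by an ASCII letter is not valid UTF-8.
try:
    pickled.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print(is_valid_utf8)
```

If this is indeed the cause, wrapping the pickled bytes in pymongo's `bson.binary.Binary` before storing them is a commonly suggested way to have them treated as binary data instead of text, though that has not been verified against the code in the question.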