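Problem: when a POST request is sent to a Flask endpoint and one of the form fields consists of raw bytes (Apache Avro format), Flask automatically tries to decode those bytes to unicode, which corrupts the data.
For example, when sending a POST request through the Python test client like this: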
# part of a python unittest
data = {
    'flavor': 'sweet',
}
schema = {
    'name': 'SimpleData',
    'type': 'record',
    'fields': [{'name': 'flavor', 'type': 'string'}]
}
schema = json.dumps(schema)
avro_bytes = to_avro_files(data, schema)
simple_data = {
    'avro_bytes': avro_bytes,
    'weather': 'warm'
}
# NOTE the avro_bytes here is of type str, and content as follows
(Pdb) avro_bytes
'Obj\x01\x04\x16avro.schema\xb4\x01{"fields": [{"type": "string", "name": "flavor"}], "type": "record", "name": "SimpleData"}\x14avro.codec\x08null\x00\x80\xb0\xef\xc1\xea\xdc\xbc\xb4!\x9c\xb0\xcd\x8eS\xafu\x02\x0c\nsweet\x80\xb0\xef\xc1\xea\xdc\xbc\xb4!\x9c\xb0\xcd\x8eS\xafu'
headers['content-type'] = 'multipart/form'
response = self.client.post('/my_api_endpoint', data=simple_data, headers=headers)
When the Flask application receives the request and we inspect the request body, we see the following:
(Pdb) request.form['weather']
u'warm'
(Pdb) request.form['avro_bytes']
u'Obj\x01\x04\x16avro.schema\ufffd\x01{"fields": [{"type": "string", "name": "flavor"}], "type": "record", "name": "SimpleData"}\x14avro.codec\x08null\x00\ufffd\ufffd\ufffd\ufffd\ufffd\u073c\ufffd!\ufffd\ufffd\u034eS\ufffdu\x02\x0c\nsweet\ufffd\ufffd\ufffd\ufffd\ufffd\u073c\ufffd!\ufffd\ufffd\u034eS\ufffdu'
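Both args are converted to unicode. That is fine for 'weather', but 'avro_bytes' ends up with replacement characters such as u'\ufffd' wherever a byte was not valid in the assumed encoding, which makes it impossible to encode the value back to the original avro_bytes.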
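Here is a minimal sketch of what seems to be happening. It assumes the form parser decodes field values as UTF-8 with errors='replace', which I have not confirmed for my setup. It is written in Python 3 syntax for brevity, and the bytes are copied from the sync marker in the payload above:

```python
# A slice of the Avro payload shown above (the 16-byte sync marker, truncated).
raw = b'\x80\xb0\xef\xc1\xea\xdc\xbc\xb4!'

# Assumption: form values are decoded as UTF-8, replacing undecodable bytes.
decoded = raw.decode('utf-8', errors='replace')

# Each undecodable byte has collapsed to U+FFFD, so the original byte values
# are gone and re-encoding produces different bytes.
round_tripped = decoded.encode('utf-8')

print(repr(decoded))
print(round_tripped == raw)
```

The two bytes \xdc\xbc happen to form a valid UTF-8 sequence, which would explain the stray u'\u073c' in the output above. Every other non-ASCII byte becomes the same replacement character, so the conversion cannot be reversed.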
In short, how do I prevent the Flask app from trying to decode avro_bytes to unicode?
Thanks!