base64.urlsafe_b64encode is incorrect if the content contains extended ASCII characters

Asked: 2015-07-02 23:57:29

Tags: python json xml python-2.7 base64

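I have a Python (2.7) client that talks to a PHP API. The Python client takes some UTF-8 encoded XML, converts it to JSON, and then sends it to the API. There are other things going on, but let's say that's all for the sake of brevity.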

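If any string in the XML contains extended ASCII characters, such as “ and ” (typographic quotes), • (bullet), and anything else I can think of, I find that the client does not send the data to the API correctly. If the XML data contains no extended characters, the data POSTs to the API just fine, but if it does contain extended characters, the POST data is empty.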

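I know this is an encoding issue, but so far I have only been able to make sure that json.dumps doesn't mangle the dict with all of its unicode values.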
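
As a minimal sketch of that json.dumps check in isolation (the sample string and variable names below are illustrative assumptions, not the client's real data), both the default escaped output and the ensure_ascii=False output should decode back to the same dict:

```python
# -*- coding: utf-8 -*-
import json

# Hypothetical sample payload with typographic quotes (U+201C, U+2019, U+201D).
payload = {u'description': u'\u201cI thought I\u2019d make a supergroup,\u201d she said.'}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes,
# so the resulting JSON text is pure ASCII.
escaped = json.dumps(payload)

# ensure_ascii=False keeps the characters as they are; the text then has to
# be encoded explicitly before it can be handled as bytes.
raw = json.dumps(payload, ensure_ascii=False)
raw_bytes = raw.encode('utf-8')

# Both forms carry the same data.
same_from_escaped = json.loads(escaped) == payload
same_from_raw = json.loads(raw_bytes.decode('utf-8')) == payload
```

If both comparisons hold, json.dumps itself is probably not where the characters get lost.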

The same problem occurs if the XML has the extended characters as HTML entities.

I know this happens when the data is passed through base64.urlsafe_b64encode. I know I'm missing something, but I can't tell where I'm going wrong.
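
For comparison, here is a small sketch of base64.urlsafe_b64encode on its own, assuming the text is first encoded to UTF-8 bytes (the sample text is made up). The function works on bytes, so a round trip through encode and decode should return the original characters:

```python
import base64

# Hypothetical sample text with typographic quotes and a bullet.
text = u'\u201cquoted\u201d \u2022 bullet'

# Encode the text to UTF-8 bytes first, then base64-encode those bytes.
data = text.encode('utf-8')
encoded = base64.urlsafe_b64encode(data)

# Reverse both steps to get the original text back.
decoded = base64.urlsafe_b64decode(encoded).decode('utf-8')
round_trip_ok = decoded == text
```

If this round trip succeeds on the client machine, the encoding step is likely fine when it receives properly encoded bytes, and the input it actually receives is worth inspecting.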

Here is a simplified version of what I'm doing:

import urllib3
import base64
import hmac
import json
import hashlib

def signString(string_to_sign, shared_secret):
    return hmac.new(shared_secret, string_to_sign, hashlib.sha512).hexdigest()

def send_to_api(action, payload):

    SHARED_SECRET = 'my_secret'

    json_payload = json.dumps(payload, ensure_ascii=False).strip(' \t\n\r')
    json_payload = json_payload.encode('utf-8')

    signature = signString(json_payload, SHARED_SECRET)
    encoded_signature = base64.urlsafe_b64encode(signature.strip(' \t\n\r'))
    encoded_payload = base64.urlsafe_b64encode(json_payload)

    #post = 'v=1.4.8&data={}'.format(encoded_payload)
    headers = { 'Action' : action, 'Signature': encoded_signature }
    pool = urllib3.connectionpool.HTTPSConnectionPool("mysite.com", maxsize=1, block=True, headers=headers, retries=10)
    request = pool.request('POST', '/api/api.v7.php', fields={'v': '1.4.6', 'data': encoded_payload})
    data = request.read()
    request.close()

##etc
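
One way to narrow this down is to run the signing and encoding steps without the network call and decode the result again. The sketch below is an assumption about how that could look, not the real client: build_request_parts and sign_string are hypothetical names, the payload is made up, and the secret is passed as bytes so the code also runs on Python 3.

```python
import base64
import hashlib
import hmac
import json


def sign_string(message, shared_secret):
    # hmac operates on bytes, so both arguments are expected to be byte strings.
    return hmac.new(shared_secret, message, hashlib.sha512).hexdigest()


def build_request_parts(payload, shared_secret):
    # Serialize to JSON text, then encode explicitly to UTF-8 bytes.
    json_payload = json.dumps(payload, ensure_ascii=False).encode('utf-8')
    signature = sign_string(json_payload, shared_secret)
    # The hex digest is plain ASCII, so encoding it to bytes is lossless.
    encoded_signature = base64.urlsafe_b64encode(signature.encode('ascii'))
    encoded_payload = base64.urlsafe_b64encode(json_payload)
    return encoded_signature, encoded_payload


# Hypothetical payload containing typographic quotes and a bullet.
sample = {u'description': u'\u201cfive\u201d \u2022 two'}
sig, data = build_request_parts(sample, b'my_secret')

# Decode the payload again to check that nothing was lost along the way.
round_tripped = json.loads(base64.urlsafe_b64decode(data).decode('utf-8'))
```

If round_tripped equals the input here but the API still receives empty data, the HTTP layer or the server-side decoding would be the next place to look.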

Any ideas on what I might be missing?

Edit

Here is the source dict that I pass to send_to_api via the payload parameter:

{'description': u'Before joining together as Fake Tears, Larissa Loyva and Elisha May Rembold were already established singers and songwriters in their own right. The former released two spectral albums under the name Kellarissa, toured the world as a live member of Destroyer and How to Dress Well, and played in past Mint Records acts P:ano and The Choir Practice; the latter leads the folk-rock band the Lost Lovers Brigade and is a member of Shimmering Stars. The pair first collaborated when Loyva joined the lineup of Lost Lovers, and honed their chemistry during a stint as backup vocalists in the funky local ensemble COOL TV.\nTheir current project dates back to 2012, when Loyva and Rembold got together as part of a larger ensemble. \u201cI thought I\u2019d make an all-woman supergroup,\u201d Loyva remembers. \u201cWe started with five, and then, after a couple of months and a couple of practices, we were down to two.'}

Here is what it looks like after json.dumps has run, once it has become a string:

{"description": "Before joining together as Fake Tears, Larissa Loyva and Elisha May Rembold were already established singers and songwriters in their own right. The former released two spectral albums under the name Kellarissa, toured the world as a live member of Destroyer and How to Dress Well, and played in past Mint Records acts P:ano and The Choir Practice; the latter leads the folk-rock band the Lost Lovers Brigade and is a member of Shimmering Stars. The pair first collaborated when Loyva joined the lineup of Lost Lovers, and honed their chemistry during a stint as backup vocalists in the funky local ensemble COOL TV.\nTheir current project dates back to 2012, when Loyva and Rembold got together as part of a larger ensemble. \u201cI thought I\u2019d make an all-woman supergroup,\u201d Loyva remembers. \u201cWe started with five, and then, after a couple of months and a couple of practices, we were down to two."}

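One thing I noticed this morning: if I print the dict before passing it to send_to_api(), its strings are unicode, a la u'my unicode string'. But if I print the payload parameter inside send_to_api(), it no longer appears to be unicode the way the source dict above is: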

{'description': 'The pair first collaborated when Loyva joined the lineup of Lost Lovers, and honed their chemistry during a stint as backup vocalists in the funky local ensemble COOL TV.\nTheir current project dates back to 2012, when Loyva and Rembold got together as part of a larger ensemble. \xe2\x80\x9cI thought I\xe2\x80\x99d make an all-woman supergroup,\xe2\x80\x9d Loyva remembers. \xe2\x80\x9cWe started with five, and then, after a couple of months and a couple of practices, we were down to two.'}
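
The \xe2\x80\x9c and \xe2\x80\x99 sequences in that output match the UTF-8 encodings of \u201c and \u2019, which suggests the values were encoded to byte strings somewhere before being printed. A short sketch of that correspondence (the sample fragment is taken from the text above; the variable names are illustrative):

```python
# Unicode text with a left double quote (U+201C) and a right single quote (U+2019).
text = u'\u201cI thought I\u2019d'

# Encoding to UTF-8 turns each of those characters into three bytes.
utf8_bytes = text.encode('utf-8')
matches_printed_form = utf8_bytes == b'\xe2\x80\x9cI thought I\xe2\x80\x99d'

# Decoding the bytes as UTF-8 recovers the original unicode text.
recovered = utf8_bytes.decode('utf-8')
```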

0 answers:

No answers yet