How to remove a byte order mark in Python

Asked: 2014-07-03 13:03:42

Tags: python byte-order-mark

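This question is related to a change in the Stack Overflow API that I recently reported here. On that question I received a response that looked like it should work, but I can't actually get it to work.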

Here is my code:

import requests
import json
url="https://api.stackexchange.com/2.2/sites/?filter=%21%2AL1%2AAY-85YllAr2%29&pagesize=1&page=1"
response = requests.get(url)
response.text

This outputs:

u'\ufeff{"items":[{"site_state":"normal","api_site_parameter":"stackoverflow","name":"Stack Overflow"}],"has_more":true,"quota_max":300,"quota_remaining":294}'

The leading \ufeff character (the byte order mark) means that if I call response.json() I get ValueError: No JSON object could be decoded

The suggestion I was given was to use decode('utf-8-sig'). However, I can't seem to get this to work:

Attempt 1:

response.text.decode('utf-8-sig')
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 0: ordinal not in range(128)

Attempt 2:

json.loads(response.text).decode('utf-8-sig')
ValueError: No JSON object could be decoded

What is the proper way to remove the leading u'\ufeff'?

1 answer:

Answer 0 (score: 5)

response.text is a Unicode object, i.e. it has already been decoded, so you cannot decode it again.
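To illustrate the difference between bytes and already-decoded text, here is a minimal sketch in Python 3 syntax (the question itself uses Python 2). The payload is a made-up stand-in for the API response, not real output. It shows that decoding bytes with "utf-8-sig" drops the byte order mark, while plain "utf-8" keeps it as the character U+FEFF:

```python
import codecs

# Hypothetical payload: a UTF-8 BOM followed by a small JSON document.
raw = codecs.BOM_UTF8 + b'{"has_more": true}'

# Plain UTF-8 decoding keeps the BOM as the character U+FEFF.
as_utf8 = raw.decode("utf-8")

# The "utf-8-sig" codec strips a leading BOM while decoding.
as_utf8_sig = raw.decode("utf-8-sig")

print(repr(as_utf8))
print(repr(as_utf8_sig))
```

In Python 2, calling .decode() on a unicode object first encodes it implicitly with the ASCII codec, which is why Attempt 1 fails with a UnicodeEncodeError rather than a decoding error.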

What you need to do is tell the response object which encoding it should use:

response = requests.get(url)
response.encoding = "utf-8-sig"
response.text

See the docs for more background info.
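As an alternative to setting response.encoding, the raw bytes can be decoded by hand before parsing. The sketch below is only an illustration in Python 3 syntax: parse_json_bytes and strip_bom are hypothetical helper names, and the sample bytes stand in for the real API response so that the example runs without network access.

```python
import codecs
import json


def parse_json_bytes(raw):
    """Decode UTF-8 bytes, dropping a leading BOM if present, then parse JSON."""
    return json.loads(raw.decode("utf-8-sig"))


def strip_bom(text):
    """Remove one leading U+FEFF from text that has already been decoded."""
    return text[1:] if text.startswith("\ufeff") else text


# Made-up sample standing in for the bytes returned by the API.
sample = codecs.BOM_UTF8 + b'{"quota_max": 300, "has_more": true}'

# Route 1: decode the raw bytes with the BOM-aware codec.
data = parse_json_bytes(sample)

# Route 2: the text was decoded as plain UTF-8, so strip the BOM afterwards.
same = json.loads(strip_bom(sample.decode("utf-8")))

print(data == same)
```

With requests, the bytes would presumably come from response.content, e.g. parse_json_bytes(response.content), whereas strip_bom fits the case where only response.text is at hand.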