How do I fix a JSON "Invalid control character" error?

Time: 2014-09-19 17:40:14

Tags: python json

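FWIW, I have used other JSON APIs (Tumblr, Twitter, etc.) without any problems.

My Google Blogger client uses Google's Blogger API so that I can do things such as download a user's blog posts or a user's pictures.

Here are the important parts of my Blogger client: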

import requests
#import json
#import demjson

class BloggerClient(object):
    """ Client interface for Blogger API. """
    def __init__(self, key):
        self.key = key
        self.base = 'https://www.googleapis.com/blogger/v3'

    def _send_request(self, url, parameters=None):
        """ Sends an HTTP GET request to Blogger API.
            Returns JSON decoded response. """

        # Format the full URL
        url = '{base}{url}?'.format(base=self.base, url=url)

        # API key is always required, so add it to a copy of the parameters
        # (neither a shared default dict nor the caller's dict is modified)
        parameters = dict(parameters or {})
        parameters['key'] = self.key

        try:
            # Requests module formats parameters into the URL for me
            r = requests.get(url, params=parameters)
            print 'Connecting:', r.url, '\n'            # debug
        except requests.exceptions.RequestException:
            print "** Could not reach url:\n", url
            return
        return r.json()

if __name__ == '__main__':
    pass

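If I run the script 5 times with the same valid input, it fails on about one of those runs.

Here is the error I just got (even though the script succeeded on the same input 1 minute earlier):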

Traceback (most recent call last):
  File "G:\programming\python\corerip\blogger_test.py", line 8, in <module>
    posts = b.get_all_posts()
  File "G:\programming\python\corerip\site_blogger.py", line 25, in get_all_posts
    page_of_posts = self.client.get_posts(blog_id)
  File "G:\programming\python\corerip\client_blogger.py", line 64, in get_posts
    return self._send_request(api_url, kwargs)
  File "G:\programming\python\corerip\client_blogger.py", line 30, in _send_request
    return r.json()
  File "C:\Python27\lib\site-packages\requests-2.3.0-py2.7.egg\requests\models.py", line 763, in json
    return json.loads(self.text, **kwargs)
  File "C:\Python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python27\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid control character at: line 101 column 3095 (char 38567)
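The message reports a character offset (`char 38567`), so one way to narrow this down is to look at the response text around that offset. The following is only a sketch: `context_at` is a hypothetical helper, and `sample` is a made-up payload with a raw vertical tab inside a string, not the actual Blogger response. It is written to run on Python 3.

```python
import json


def context_at(text, pos, width=20):
    """Return repr() of the text around character offset `pos`.

    repr() renders control characters as escape sequences,
    which makes them visible in a console.
    """
    start = max(0, pos - width)
    return repr(text[start:pos + width])


# Hypothetical payload: a raw vertical tab (U+000B) inside a JSON string.
sample = '{"content": "line one\x0bline two"}'

try:
    json.loads(sample)
    failed = False
except ValueError:  # the json decode error is a ValueError (or a subclass of it)
    failed = True

# The offset is found by searching here; with a real response, pass the
# "char N" value from the error message instead.
offset = sample.index('\x0b')
snippet = context_at(sample, offset)
print(failed, snippet)
```

With the real response, something like `context_at(r.text, 38567)` would show which character the decoder rejected, assuming the failing response text is still available.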

This is the JSON response it should be working with:

http://pastebin.com/5iW1C8Dc

The input URL is:

https://www.googleapis.com/blogger/v3/blogs/233501935878754401/posts?key=[YOUR_GOOGLE_API_KEY_HERE]

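If I run the script again, it will probably succeed. This makes no sense to me. Maybe it has something to do with the JSON module not handling unicode properly by default?

Can someone help me solve this random error? The fact that it appears to be random is what baffles me most.

I have tried almost every json module, and they all fail randomly when used with Google's API.

Edit: I tried to escape the ' characters by using a raw string, but that fails too: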
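For context, JSON requires control characters below U+0020 inside a string to be escaped, and Python's `json` decoder rejects raw ones by default. The decoder has a `strict` flag that relaxes this check. The sketch below uses a made-up payload (`raw`) with a literal tab inside a string; whether the Blogger response really contains such a character is an assumption, and this does not explain why the response would differ between runs.

```python
import json

# Hypothetical payload: a raw tab character inside a JSON string value.
raw = '{"title": "pound\tsign"}'

try:
    json.loads(raw)  # strict=True is the default
    strict_ok = True
except ValueError:
    strict_ok = False

# strict=False tells the decoder to accept control characters inside strings.
relaxed = json.loads(raw, strict=False)
print(strict_ok, relaxed["title"] == "pound\tsign")
```

Since the traceback shows `requests` forwarding keyword arguments to `json.loads`, the same flag could presumably be passed as `r.json(strict=False)`.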

    api_response = r'''{rtext}'''.format(rtext=r.text)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 27475: ordinal not in range(128)

0 answers:

No answers yet