Can't parse a non-ASCII URL in Python?

Asked: 2015-11-09 18:40:17

Tags: python-2.7 urllib

I want to query the Freebase API to get the list of teams that José Mourinho has played for.

The URL I use in the browser is

https://www.googleapis.com/freebase/v1/mqlread?query=[{"name": "José Mourinho","/sports/pro_athlete/teams": [{"mid": null,"team": null,"to": null,"optional": true}]}]

However,

import json
import urllib

service_url="https://www.googleapis.com/freebase/v1/mqlread"
query = '[{"name": "' + "José Mourinho" + '","/sports/pro_athlete/teams": [{"mid": null,"team": null,"to": null,"optional": true}]}]'
url = service_url + '?' + 'query='+query
response = json.loads(urllib.urlopen(url).read())

gives me this error:

UnicodeError: URL u'https://www.googleapis.com/freebase/v1/mqlread?query=[{"name": "Jos\xe9 Mourinho","/sports/pro_athlete/teams": [{"mid": null,"team": null,"to": null,"optional": true}]}]' contains non-ASCII characters

How do I fix this?

1 answer:

Answer 0 (score: 1)

I think you skipped over a bit of the docs. Try this:

# coding=UTF-8

import json
import urllib

service_url = "https://www.googleapis.com/freebase/v1/mqlread"
query = [{
    '/sports/pro_athlete/teams': [
        {
            'to': None,
            'optional': True,
            'mid': None,
            'team': None
        }
    ],
    'name': 'José Mourinho'
}]

url = service_url + '?' + urllib.urlencode({'query': json.dumps(query)})
response = json.loads(urllib.urlopen(url).read())

print response

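Don't build the query string yourself. Let json.dumps serialize the query and urllib.urlencode percent-encode it into the query string for you. They're good at that.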
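To see why this avoids the UnicodeError, here is a minimal sketch of what the two calls produce. The helper name build_mqlread_url is made up for illustration and the query is trimmed to just the name field. The try/except import is only there so the snippet also runs on Python 3, where urlencode lives in urllib.parse; no request is sent.

```python
# -*- coding: utf-8 -*-
import json

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, as in the answer above


def build_mqlread_url(service_url, query):
    """Hypothetical helper: serialize `query` to JSON and append it, encoded."""
    # json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
    # so the "é" in the name becomes the ASCII escape sequence \u00e9.
    query_json = json.dumps(query)
    # urlencode percent-encodes quotes, brackets, and the backslash,
    # and turns spaces into "+".
    return service_url + '?' + urlencode({'query': query_json})


# Trimmed-down query, for illustration only.
query = [{'name': u'José Mourinho'}]
url = build_mqlread_url("https://www.googleapis.com/freebase/v1/mqlread", query)
print(url)
```

The resulting URL contains only ASCII characters, which is what urlopen requires.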

Note: if you can use the requests package, that last bit could be:

import requests
response = requests.get(service_url, params={'query': json.dumps(query)})

Then you can skip the URL construction and escaping entirely!