I have a function that fetches an article's text from the Wikimedia API:
import requests
import six

def get_content(lang, title):
    url = "https://" + lang + ".wikipedia.org/w/api.php"
    # 'prop' was originally listed twice ('linkshere' and 'extracts');
    # duplicate dict keys are silently overwritten, so only 'extracts' is sent
    params = {'action': 'query', 'format': 'json', 'titles': title,
              'prop': 'extracts', 'redirects': 'true',
              'exlimit': 'max', 'explaintext': 'true'}
    response = requests.get(url=url, params=params).json()
    content = response["query"]["pages"]
    content = six.next(six.itervalues(content))['extract']
    return content
However, when I call this function, an error occurs:
Traceback (most recent call last):
  File "/home/klux/anaconda3/lib/python3.5/site-packages/requests/adapters.py", line 423, in send
    timeout=timeout
  File "/home/klux/anaconda3/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 640, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/klux/anaconda3/lib/python3.5/site-packages/requests/packages/urllib3/util/retry.py", line 287, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='xn--mesut%20zil-57a.wikipedia.org', port=443): Max retries exceeded with url: /w/api.php?exlimit=max&titles=en&action=query&format=json&prop=extracts&redirects=true&explaintext=true (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f7a5e74c0b8>: Failed to establish a new connection: [Errno -2] Name or service not known',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "get_data.py", line 165, in <module>
    print(get_content('Mesut Özil','en'))
  File "get_data.py", line 147, in get_content
    response = request_get_data(url=url,params=params).json()
  File "/home/klux/anaconda3/lib/python3.5/site-packages/requests/api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "/home/klux/anaconda3/lib/python3.5/site-packages/requests/api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/klux/anaconda3/lib/python3.5/site-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/klux/anaconda3/lib/python3.5/site-packages/requests/sessions.py", line 596, in send
    r = adapter.send(request, **kwargs)
  File "/home/klux/anaconda3/lib/python3.5/site-packages/requests/adapters.py", line 487, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='xn--mesut%20zil-57a.wikipedia.org', port=443): Max retries exceeded with url: /w/api.php?exlimit=max&titles=en&action=query&format=json&prop=extracts&redirects=true&explaintext=true (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f7a5e74c0b8>: Failed to establish a new connection: [Errno -2] Name or service not known',))
At first I thought Wikimedia had blocked me for sending too many requests, but when I run the same code directly in the main block, it works perfectly:
if __name__ == '__main__':
    lang = 'en'
    title = 'Mesut Özil'
    url = "https://" + lang + ".wikipedia.org/w/api.php"
    # as above, the duplicate 'prop' key means only 'extracts' survives
    params = {'action': 'query', 'format': 'json', 'titles': title,
              'prop': 'extracts', 'redirects': 'true',
              'exlimit': 'max', 'explaintext': 'true'}
    response = requests.get(url=url, params=params).json()
    content = response["query"]["pages"]
    content = six.next(six.itervalues(content))['extract']
    print(content)
Output:
Mesut Özil (German pronunciation: [ˈmeːzut ˈøːzil], Turkish: [meˈsut ˈøzil]; born 15 October 1988) is a German footballer who plays for English club Arsenal and the German national team...
I don't know how to fix this strange behavior. I tried debugging with Visual Studio Code, and the response variable in get_content shows up as undefined. Does anyone have a solution for this?
Answer 0 (score: 1)
You declared your function as:
def get_content(lang, title):
But in your traceback I see:
print(get_content('Mesut Özil','en'))
I'm pretty sure there is no https://Mesut Özil.wikipedia.org
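In other words, the two positional arguments are swapped: 'Mesut Özil' is being used as the language code, so it lands in the hostname (which requests then encodes into the unresolvable 'xn--mesut%20zil-57a.wikipedia.org' seen in the traceback), while 'en' is being sent as the article title. A minimal sketch illustrating this, using a hypothetical helper build_api_url (not from the original code) that mirrors how the URL is assembled:

```python
def build_api_url(lang):
    """Build the Wikipedia API endpoint for a given language edition."""
    return "https://" + lang + ".wikipedia.org/w/api.php"

# Correct order: language code first.
print(build_api_url('en'))          # https://en.wikipedia.org/w/api.php

# Swapped arguments put the article title into the hostname,
# producing a domain that can never resolve:
print(build_api_url('Mesut Özil'))  # https://Mesut Özil.wikipedia.org/w/api.php
```

So the fix is simply to call get_content('en', 'Mesut Özil'), or to use keyword arguments (get_content(lang='en', title='Mesut Özil')) so the order can't be confused.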