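My problem is this: starting from the top-level category (Categorie:Alles), I can loop again over the subcategories I find, but I can't get them nested inside my dictionary.
Can anyone help or spot the error?
Thanks in advance,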
import requests # http://docs.python-requests.org/en/latest/
import json
from bs4 import BeautifulSoup

category = 'Categorie:Alles'

def wiki_api_request(category):
    url = ('http://nl.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtitle=%s&cmlimit=500')%category
    return url

category_dict = {}

def crawl(category_name, _dict):
    url = wiki_api_request(category_name)
    _url = requests.get(url)
    extract = _url.json()
    category_amount = 0
    if 'query' in extract:
        category_list_json = extract['query']['categorymembers']
        _dict[category_name] = {category['title'] for category in category_list_json}

        for category in category_list_json:
            if 'Categorie:' in category['title']:
                crawl(category['title'], _dict[category_name])  # <- this gives an error
                break

crawl(category, category_dict)
print category_dict
Error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-40-b8027c8281eb> in <module>()
     29                 break
     30
---> 31 crawl(category, category_dict)
     32 print category_dict

<ipython-input-40-b8027c8281eb> in crawl(category_name, _dict)
     26         for category in category_list_json:
     27             if 'Categorie:' in category['title']:
---> 28                 crawl(category['title'], _dict[category_name])
     29                 break
     30

<ipython-input-40-b8027c8281eb> in crawl(category_name, _dict)
     22     if 'query' in extract:
     23         category_list_json = extract['query']['categorymembers']
---> 24         _dict[category_name] = {category['title'] for category in category_list_json}
     25
     26         for category in category_list_json:

TypeError: 'set' object does not support item assignment
Answer 0 (score: 2)
{category['title'] for category in category_list_json}
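is a set comprehension, not a dict comprehension. The value assigned into _dict is therefore a set. The recursive call then passes that set in as _dict, and assigning to _dict[category_name] on a set raises the TypeError shown in the traceback.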
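To make the difference concrete, here is a minimal sketch. The `members` list is made-up sample data shaped like the API's `categorymembers` entries, not real output, and the variable names are chosen only for illustration. It is written so that it runs on Python 3 as well as Python 2.

```python
# Hypothetical sample data, shaped like extract['query']['categorymembers'].
members = [{'title': 'Categorie:Kunst'}, {'title': 'Categorie:Sport'}]

# Braces with bare values build a set.
as_set = {m['title'] for m in members}

# Braces with key: value pairs build a dict.
as_dict = {m['title']: {} for m in members}

print(type(as_set).__name__)
print(type(as_dict).__name__)

# A set cannot be indexed by key, which is what the recursive call runs into.
try:
    as_set['Categorie:Kunst'] = {}
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The dict version accepts nested children without complaint.
as_dict['Categorie:Kunst']['Categorie:Muziek'] = {}
```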
You probably want the comprehension to produce a dict whose values are empty dicts, so
{category['title']:{} for category in category_list_json}
or, more explicitly,
{category['title']:dict() for category in category_list_json}
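Putting that into the recursion, the crawl might look roughly like the sketch below. This is one possible arrangement, not the asker's final code. To keep it runnable without network access, the HTTP request is replaced by a hypothetical `fetch_members` callable, and `FAKE_WIKI` and `fake_fetch` are invented stand-ins for the API response. The `seen` set is an extra precaution, not part of the answer above, in case a category links back to one of its ancestors.

```python
def crawl(category_name, tree, fetch_members, seen=None):
    """Store the members of category_name under tree[category_name] as a nested dict."""
    seen = set() if seen is None else seen
    seen.add(category_name)

    children = {}
    tree[category_name] = children

    for member in fetch_members(category_name):
        title = member['title']
        if not title.startswith('Categorie:'):
            children[title] = {}   # ordinary page: a leaf
        elif title in seen:
            children[title] = {}   # already visited: record it, but do not recurse
        else:
            # Recurse into the dict that holds this category's children.
            crawl(title, children, fetch_members, seen)
    return tree


# Invented data standing in for the 'categorymembers' lists of the real API.
FAKE_WIKI = {
    'Categorie:Alles': [{'title': 'Categorie:Kunst'}, {'title': 'Hoofdpagina'}],
    'Categorie:Kunst': [{'title': 'Categorie:Muziek'}, {'title': 'Categorie:Alles'}],
    'Categorie:Muziek': [{'title': 'Jazz'}],
}


def fake_fetch(category_name):
    """Hypothetical replacement for the requests.get(...).json() lookup."""
    return FAKE_WIKI.get(category_name, [])


category_dict = crawl('Categorie:Alles', {}, fake_fetch)
print(category_dict)
```

With the real API, `fetch_members` would wrap the `requests.get` call from the question and return `extract['query']['categorymembers']`, or an empty list when `'query'` is missing.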