Question

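I'm working on a project that opens the wiki main page, follows every link on that page that points to a category, then grabs the first 10 links on each category page and writes them to a file.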

Code:

url_list = open('url_list', 'w')

counter = 0

urls = []

html = urllib.request.urlopen('https://commons.wikipedia.org/wiki/Main_Page')

soup = bs.BeautifulSoup(html, 'lxml')

for item in soup.find_all('a'):
    urls.append(item.get('href'))

    for item in urls:

        if 'Category' in item:
            page = urllib.request.urlopen('https://commons.wikipedia.org/' + item)

            soup = bs.BeautifulSoup(page, 'lmxl')

            if counter < 10:
                for item in soup.find_all('a'):
                    url_list.write(item.get('href'))

                    counter += 1

url_list.close()

When I run the code, I get this TypeError:

Traceback (most recent call last):
  File "/Users/huntergary/Web_links.py", line 42, in <module>
    main()
  File "/Users/huntergary/Web_links.py", line 23, in main
    if 'Category' in item:
TypeError: argument of type 'NoneType' is not iterable

Answer 1

item.get('href') returns None for any <a> tag that has no href attribute, so some entries in urls are None. Check that the href is not None before appending it to the urls list, or check that item is not None before testing whether 'Category' is in it:

href = item.get('href')
if href is not None:
    urls.append(href)

Or,

if item is not None and 'Category' in item:

Either approach should stop you from testing None objects from the urls list.
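
A minimal sketch of why the error occurs and how the guard avoids it. A plain dict stands in for the attributes of an <a> tag without an href (an assumption for illustration; no page is fetched here), since looking up a missing key with get() yields None in the same way.

```python
# Hypothetical stand-in for the attributes of an <a> tag that has no href.
anchor_attrs = {'name': 'top'}

href = anchor_attrs.get('href')  # missing key, so this is None

# Unguarded membership test: fails because None does not support `in`.
try:
    'Category' in href
    raised = False
except TypeError:
    raised = True

# Guarded test: short-circuits on None, so `in` is never evaluated.
is_category = href is not None and 'Category' in href

print(raised, is_category)  # prints: True False
```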

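As a side note, you should consider not reusing the variable name item three times in a nested context like this. Deeper in the code, it is not always clear which item you are referring to.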
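
One way to apply both points, sketched with hypothetical names (first_category_links, href, and category_links are not from the question): each loop variable gets its own name, and the None check lives in one place. The sample list is made up; in the question these values would come from calling get('href') on each anchor.

```python
def first_category_links(hrefs, limit=10):
    """Return up to `limit` hrefs containing 'Category', skipping None values."""
    category_links = []
    for href in hrefs:
        # Skip anchors without an href and links that are not categories.
        if href is None or 'Category' not in href:
            continue
        category_links.append(href)
        if len(category_links) == limit:
            break
    return category_links


# Made-up sample input, including a None like the one behind the TypeError.
sample_hrefs = ['/wiki/Category:Images', None, '/wiki/Main_Page', '/wiki/Category:Sounds']
print(first_category_links(sample_hrefs))
# prints: ['/wiki/Category:Images', '/wiki/Category:Sounds']
```

Passing the limit as a parameter keeps the "first 10" rule separate from the filtering, so the count starts fresh on every call instead of depending on a shared counter.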

TypeError when searching a list
