Question

I am trying to run the following Python code:

import httplib2
from BeautifulSoup import BeautifulSoup, SoupStrainer

http = httplib2.Http()
status, response = http.request('http://www.nytimes.com')

for link in BeautifulSoup(response, parseOnlyThese=SoupStrainer('a')):
    if link.has_attr('href'):
        print link['href']

Edit: I changed the code to this:

for link in BeautifulSoup(response).find_all('a', href=True):
    print link['href']

but I still get the same error.

This is the error I get:

Traceback (most recent call last):
  File "/home/user1/Documents/machinelearning/extract_links.py", line 8, in <module>
    if link.has_attr('href'):
TypeError: 'NoneType' object is not callable

What causes this error, and how can I fix it?

Answer 1

Your list returns a bunch of values along with None.

In my opinion, it is better to use find_all() here:

for link in BeautifulSoup(response).find_all('a', href=True):
    print link['href']

href=True matches only the tags that have an href value, so you don't need the conditional.
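One possible reason the edited code in the question still fails: `from BeautifulSoup import ...` loads BeautifulSoup 3, while `find_all` and `has_attr` are BeautifulSoup 4 names. As far as I know, BeautifulSoup 3 treats an unknown attribute on a tag as a lookup for a child tag of that name, which returns None when no such tag exists, and calling that None would produce exactly this TypeError. The sketch below assumes the bs4 package is installed and uses a made-up HTML string in place of the downloaded page; `extract_links` and `sample_html` are illustrative names, not part of the original code.

```python
# Assumes BeautifulSoup 4 is installed (pip install beautifulsoup4).
from bs4 import BeautifulSoup


def extract_links(html):
    """Return the href value of every <a> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True matches only tags that carry an href attribute,
    # so no has_attr() check is needed inside the loop.
    return [link["href"] for link in soup.find_all("a", href=True)]


# Hypothetical sample markup standing in for the downloaded page.
sample_html = """
<a href="http://www.nytimes.com/section/world">World</a>
<a name="top">anchor without href</a>
<a href="/section/science">Science</a>
"""

for href in extract_links(sample_html):
    print(href)
```

The `response` value returned by `http.request(...)` in the question could be passed to `extract_links` in place of `sample_html`.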
