Question

I wrote this code to extract all the text from a web page:

from BeautifulSoup import BeautifulSoup
import urllib2

soup = BeautifulSoup(urllib2.urlopen('http://www.pythonforbeginners.com').read())
print(soup.get_text())

The problem is that I get this error:

print(soup.get_text())
TypeError: 'NoneType' object is not callable

Any ideas on how to fix this?

Answer 1

In the BeautifulSoup version you are importing, the method is called soup.getText(), i.e. camelCased.
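
If the same script has to run against either spelling of the method, a small helper can pick whichever one is actually callable. This is only a sketch: extract_text is a hypothetical name, not part of BeautifulSoup, and it assumes that the old package hands back None for soup.get_text instead of raising AttributeError.

```python
def extract_text(soup):
    """Return the text of a parsed document (hypothetical helper, not a library API)."""
    # Assumption: on the old package, soup.get_text is not a method and
    # evaluates to None, so test for callability rather than existence.
    method = getattr(soup, "get_text", None)
    if not callable(method):
        # Fall back to the camelCased spelling named in this answer.
        method = soup.getText
    return method()
```

With the question's code, this would replace the last line with print(extract_text(soup)).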

Why you get a TypeError rather than an AttributeError is a mystery to me!
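
A likely explanation for the TypeError: BeautifulSoup 3 treats an unknown attribute on a tag as a search for a child tag with that name, so soup.get_text is a lookup for a <get_text> element, which returns None, and calling None raises the error from the question. The sketch below imitates that behaviour with a hypothetical FakeTag class; it is a stand-in for illustration, not the library's real implementation.

```python
class FakeTag(object):
    """Hypothetical stand-in for a BeautifulSoup 3 tag, for illustration only."""

    def find(self, name):
        # Pretend the document contains no child tag with this name.
        return None

    def getText(self):
        # The real, camelCased method exists, so normal lookup finds it.
        return "page text"

    def __getattr__(self, name):
        # Runs only when normal attribute lookup fails:
        # the unknown name is treated as a child-tag search.
        return self.find(name)


soup = FakeTag()
print(soup.get_text)      # None, because there is no <get_text> tag
try:
    soup.get_text()       # same as None()
except TypeError as exc:
    print(exc)            # 'NoneType' object is not callable
print(soup.getText())     # page text
```

Because the fallback lookup returns None instead of failing, no AttributeError is ever raised; the error only appears at the call.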

Answer 2

As Markku suggested in the comments, I would recommend breaking your code up into separate steps.

from BeautifulSoup import BeautifulSoup
import urllib2

URL = "http://www.pythonforbeginners.com"
page = urllib2.urlopen(URL)
html = page.read()
soup = BeautifulSoup(html)
print(soup.get_text())

If it still doesn't work, add some print statements to see what is going on.

from BeautifulSoup import BeautifulSoup
import urllib2

URL = "http://www.pythonforbeginners.com"
print("URL is {} and its type is {}".format(URL,type(URL)))
page = urllib2.urlopen(URL)
print("Page is {} and its type is {}".format(page, type(page)))
html = page.read()
print("html is {} and its type is {}".format(html, type(html)))
soup = BeautifulSoup(html)
print("soup is {} and its type is {}".format(soup, type(soup)))
print(soup.get_text())

'NoneType' object is not callable BeautifulSoup error when using get_text
