Question

My script is below; the soup.get_text() call raises an error. Code:

from BeautifulSoup import *
soup=BeautifulSoup(open("F:\\HTML\\Registrationform.html"))
print soup.get_text('+')

Error: File "C:/Python27/beautifulsoup4-4.6.0.tar/scrapingbasic.py", line 3,

 print soup.get_text('+')
TypeError: 'NoneType' object is not callable

Answer 1

The BeautifulSoup class expects the HTML/XML content itself in its constructor, so adding .read() to the open() call should work. Here is the code:

from BeautifulSoup import *
soup=BeautifulSoup(open("F:\\HTML\\Registrationform.html").read())

print soup.get_text('+')

Also, I recommend upgrading to BeautifulSoup4.
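For reference, here is a minimal sketch of the same script on BeautifulSoup 4, assuming the `beautifulsoup4` package is installed and Python 3 is used. `get_text()` is a BeautifulSoup 4 method; the old `BeautifulSoup` (version 3) module treats an unknown attribute such as `soup.get_text` as a search for a `<get_text>` tag and returns None, which would explain the "'NoneType' object is not callable" message. The helper name `extract_text` and the sample markup are made up for illustration.

```python
# Assumes BeautifulSoup 4 is installed: pip install beautifulsoup4
from bs4 import BeautifulSoup


def extract_text(html, separator="+"):
    """Return all text in the document, joined by `separator`."""
    # "html.parser" ships with Python, so no extra parser is required.
    soup = BeautifulSoup(html, "html.parser")
    return soup.get_text(separator)


if __name__ == "__main__":
    # Hypothetical stand-in for the contents of Registrationform.html.
    sample = "<p>Name</p><p>Email</p>"
    print(extract_text(sample))  # Name+Email
```

To run it on the real file, read the file and pass its contents, for example `extract_text(open("F:\\HTML\\Registrationform.html").read())`.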

Hope this helps.

Answer 2

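BeautifulSoup needs an HTML/XML document. Can you check whether Python 2.x can actually read and parse your HTML file, just to double-check? Another problem can occur on Windows: make sure the lxml library installed successfully. You can also consult the documentation at https://www.crummy.com/software/BeautifulSoup/bs4/doc/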
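One way to act on the lxml advice is to check whether lxml can be imported and fall back to the parser bundled with Python when it cannot. This is only a sketch, assuming Python 3; the function name `pick_parser` is hypothetical and not part of BeautifulSoup.

```python
import importlib.util


def pick_parser():
    """Return "lxml" if that package is importable, else "html.parser"."""
    # find_spec returns None when the top-level module cannot be found.
    if importlib.util.find_spec("lxml") is not None:
        return "lxml"
    # Fallback: the pure-Python parser from the standard library.
    return "html.parser"


if __name__ == "__main__":
    print(pick_parser())
```

The returned name can be passed as the second argument to the BeautifulSoup 4 constructor, as in `BeautifulSoup(html, pick_parser())`, so the script does not depend on lxml being installed.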


BeautifulSoup: NoneType error when parsing an HTML file
