Question

from urllib.request import urlopen
from bs4 import BeautifulSoup
html= urlopen("http://www.pythonscraping.com/pages/page3.html")
soup= BeautifulSoup(html.read())
print(soup.find("img",{"src":"../img/gifts/img1.jpg"
}).parent.previous_sibling.get_text())

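The code above works fine, but the code below does not. It raises AttributeError: 'NoneType' object has no attribute 'parent'. Can anyone tell me why?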

from urllib.request import urlopen       
from bs4 import BeautifulSoup
html= urlopen("http://www.pythonscraping.com/pages/page3.html")
soup= BeautifulSoup(html.read())
price =soup.find("img",{"src=":"../img/gifts/img1.jpg"
}).parent.previous_sibling.get_text()
print(price)

Thanks! :)

Answer 1

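If you compare the first and second versions, you will notice the difference:

First: soup.find("img",{"src":"../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()

Note: "src"

Second: soup.find("img",{"src=":"../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()

Note: "src="

The second version raises AttributeError: 'NoneType' object has no attribute 'parent' because no img tag in the soup has an attribute named "src=" with the value "../img/gifts/img1.jpg". find() therefore returns None, and accessing .parent on None fails.

So if you remove the = from the attribute name in the second version, it should work.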
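
To see the difference without hitting the network, here is a minimal sketch. SAMPLE_HTML is made-up markup that only loosely imitates a price cell followed by an image cell (it is not the real page), and price_for is a hypothetical helper name. It uses the built-in "html.parser" so that nothing beyond bs4 is assumed to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption, not the real page): a price cell
# immediately followed by a cell that holds the image.
SAMPLE_HTML = (
    "<table><tr>"
    "<td>$15.00</td>"
    '<td><img src="../img/gifts/img1.jpg"></td>'
    "</tr></table>"
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Correct attribute name: the tag is found.
good = soup.find("img", {"src": "../img/gifts/img1.jpg"})

# Typo in the attribute name ("src="): nothing matches, so find() returns None.
bad = soup.find("img", {"src=": "../img/gifts/img1.jpg"})


def price_for(img):
    """Return the text of the cell before the image's cell, or None if img is None."""
    if img is None:
        return None
    return img.parent.previous_sibling.get_text()


print(good is not None)  # the tag was found
print(bad is None)       # no tag has an attribute literally named "src="
print(price_for(good))
print(price_for(bad))
```

Checking the result of find() for None before chaining .parent turns the AttributeError into something you can handle explicitly.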

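By the way, you should state explicitly which parser you want to use; otherwise bs4 emits the following warning:

UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change code that looks like this:

BeautifulSoup([your markup])

to this:

BeautifulSoup([your markup], "lxml")

So, as the warning message says, you only need to change soup = BeautifulSoup(html.read()) to soup = BeautifulSoup(html.read(), 'lxml'), for example.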
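
As a rough check that naming the parser silences the warning, the sketch below records any warnings raised while building the soup. MARKUP is a made-up snippet, and "html.parser" is used only because it ships with Python; "lxml" is passed the same way if that package is installed.

```python
import warnings

from bs4 import BeautifulSoup

MARKUP = "<p>hello</p>"  # hypothetical sample markup, not the real page

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning raised in this block
    # Parser named explicitly as the second argument.
    soup = BeautifulSoup(MARKUP, "html.parser")

# Keep only warnings whose message mentions the parser choice.
parser_warnings = [w for w in caught if "parser" in str(w.message).lower()]

print(len(parser_warnings))
print(soup.p.get_text())
```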

AttributeError: 'NoneType' object has no attribute 'parent'
