Question

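I can find the BeautifulSoup documentation on locating specific tags by id, class, and so on, but it doesn't cover how to extract data from inside the tag itself (its attributes) rather than the content it wraps.

My problem: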

<img src=yellowbar.png width=63.94 height=10><img src=redbar.png width=36.0632181423 height=10><br />
Power:</b> 1480 / 1480<br />
<img src=yellowbar.png width=100 height=10><img src=redbar.png width=0 height=10><br />

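I have this HTML. There are 20 tags on the page in total, 3 of which have src=yellowbar.png.

My goal is to select the second one and get its width. So I guess it would go:

Find the code -> find src=yellowbar.png -> select the second one -> print the width.

How do I do this?

So far I have managed to print a list of all the tags.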

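from bs4 import BeautifulSoup

# `element` is assumed to hold the HTML string shown above
soup = BeautifulSoup(element, "lxml")

tag = soup.find_all('img')  # list of every <img> tag in the parsed HTML
print(tag)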

This returns:

[<img height="10" src="yellowbar.png" width="77"/>, <img height="10" src="redbar.png" width="0"/>]

Answer 1

If I understand your question correctly, this should solve it.

from bs4 import BeautifulSoup

content = """
<img src=yellowbar.png width=63.94 height=10><img src=redbar.png width=36.0632181423 height=10><br />
Power:</b> 1480 / 1480<br />
<img src=yellowbar.png width=100 height=10><img src=redbar.png width=0 height=10><br />
"""
soup = BeautifulSoup(content, "lxml")
# Filter on the src attribute as well, so only the yellow bars match
for tag in soup.find_all("img", {"src": "yellowbar.png"}):
    print(tag['width'])  # read an attribute value by indexing the tag with its name

Output:

63.94
100
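
The loop above prints the width of every yellow bar, while the question asks for the second one only. Below is a minimal sketch of that last step, assuming "the second one" means the second match in document order. It uses Python's built-in "html.parser" instead of "lxml" only to avoid the extra dependency, and the variable names are illustrative. Attribute values come back as strings, so the width is converted with float() before use.

```python
from bs4 import BeautifulSoup

html = """
<img src=yellowbar.png width=63.94 height=10><img src=redbar.png width=36.0632181423 height=10><br />
Power:</b> 1480 / 1480<br />
<img src=yellowbar.png width=100 height=10><img src=redbar.png width=0 height=10><br />
"""

# "html.parser" ships with Python; the answer above uses "lxml" instead
soup = BeautifulSoup(html, "html.parser")

# find_all returns a list in document order, so it can be indexed
yellow_bars = soup.find_all("img", src="yellowbar.png")

if len(yellow_bars) >= 2:
    # Attribute values are strings; convert before doing arithmetic
    second_width = float(yellow_bars[1]["width"])
else:
    # Fewer than two yellow bars on the page
    second_width = None

print(second_width)  # prints 100.0
```

The length check guards against an IndexError on pages that contain fewer than two yellow bars.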

Getting a tag's width/attribute from BeautifulSoup rather than its text