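I'm trying to get the value (a restaurant name) from a span tag that sits inside an a tag.
The page contains many a tags and span tags. I got this far with the following code: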
soup.find_all("a", "biz-name js-analytics-click")
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" data-hovercard-id="hN6KsYexY7_4VPAw0mHtMA" href="/biz/szechuan-restaurant-charlottesville?osq=chinese"><span>Szechuan Restaurant</span></a>
So basically I don't know how to get the value of the span tag that is inside an a tag with the specific class="biz-name js-analytics-click".
Answer 0 (score: 0)
Try span.text.
For example:
from bs4 import BeautifulSoup
s = """<a class="biz-name js-analytics-click" data-analytics-label="biz-name" data-hovercard-id="hN6KsYexY7_4VPAw0mHtMA" href="/biz/szechuan-restaurant-charlottesville?osq=chinese"><span>Szechuan Restaurant</span></a>"""
soup = BeautifulSoup(s, "html.parser")
# Print the text of the <span> inside every matching <a> tag
for tag in soup.find_all("a", "biz-name js-analytics-click"):
    print(tag.span.text)
Or, if the class "biz-name js-analytics-click" is unique on the page:
print( soup.find("a", "biz-name js-analytics-click").span.text )
Output:
Szechuan Restaurant
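When the page has several such links, the same loop can collect the names into a list. The following is only a sketch: the second and third links in the markup are invented for illustration, and get_text(strip=True) is swapped in for .text purely to trim surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first restaurant comes from the question,
# the other links and all href values are made up for this example.
html = """
<a class="biz-name js-analytics-click" href="/biz/first"><span>Szechuan Restaurant</span></a>
<a class="biz-name js-analytics-click" href="/biz/second"><span> Example Noodle House </span></a>
<a class="other-link" href="/about"><span>About</span></a>
"""

soup = BeautifulSoup(html, "html.parser")

names = []
for tag in soup.find_all("a", "biz-name js-analytics-click"):
    # Skip links without a <span> child instead of raising AttributeError.
    if tag.span is not None:
        names.append(tag.span.get_text(strip=True))

print(names)
```

The None check matters mostly on real pages, where some matching links may wrap an image rather than a span.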
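Answer 1 (score: 0)
No complicated code is needed: BeautifulSoup supports CSS selectors through the methods select() and select_one() (see the BeautifulSoup documentation).
If you want to find the <span> tag inside an <a> tag that has both the biz-name and js-analytics-click classes, use the selector 'a.biz-name.js-analytics-click span':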
data = """<a class="biz-name js-analytics-click" data-analytics-label="biz-name" data-hovercard-id="hN6KsYexY7_4VPAw0mHtMA" href="/biz/szechuan-restaurant-charlottesville?osq=chinese"><span>Szechuan Restaurant</span></a>"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'lxml')
print(soup.select_one('a.biz-name.js-analytics-click span').text)
Output:
Szechuan Restaurant
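One practical difference between the two approaches: passing the full string "biz-name js-analytics-click" to find_all() matches the class attribute as an exact string, while the CSS selector matches regardless of class order. A minimal sketch, assuming a second link (invented here, along with the href values) whose classes appear in the opposite order; html.parser is used so the example does not depend on lxml being installed:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second link is made up and lists the same
# two classes in reverse order.
html = """
<a class="biz-name js-analytics-click" href="/biz/first"><span>Szechuan Restaurant</span></a>
<a class="js-analytics-click biz-name" href="/biz/second"><span>Example Noodle House</span></a>
"""

soup = BeautifulSoup(html, "html.parser")

# select() returns every match; the class order in the HTML does not matter.
css_names = [
    span.get_text(strip=True)
    for span in soup.select("a.biz-name.js-analytics-click span")
]

# The multi-class string is compared against the exact class attribute value,
# so the reversed-order link is not matched here.
exact_names = [
    a.span.get_text(strip=True)
    for a in soup.find_all("a", "biz-name js-analytics-click")
]

print(css_names)
print(exact_names)
```

If the class order on the real page is stable, both approaches return the same names.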