I'd like to know how to locate the "Switch" text in the following HTML:
<div class="product_title">
<a href="/game/pc/into-the-breach" class="hover_none">
<h1>Into the Breach</h1>
</a>
<span class="platform">
<a href="/game/pc">
PC
</a>
</span>
</div>
<div class="product_data">
<ul class="summary_details">
<li class="summary_detail publisher" >
<span class="label">Publisher:</span>
<span class="data">
<a href="/company/subset-games" >
Subset Games
</a>
</span>
</li>
<li class="summary_detail release_data">
<span class="label">Release Date:</span>
<span class="data" >Feb 27, 2018</span>
</li>
<li class="summary_detail product_platforms">
<span class="label">Also On:</span>
<span class="data">
<a href="/game/switch/into-the-breach" class="hover_none">Switch</a> </span>
</li>
</ul>
</div>
So far, the code below also captures the "Also On:" label text (along with a lot of whitespace):
self.playable_on_systems_label.setText(self.html_soup.find("span", class_='platform').text.strip() + ', ' + self.html_soup.find("li", class_='summary_detail product_platforms').text.strip())
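How can I capture only the "Switch" text (in this case)?
FYI, the first half of the statement (capturing "PC") works fine; only the "Also On" part returns more text than I want.
Thanks in advance.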
Answer 0 (score: 0)
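Your query fetches the entire li element with class="summary_detail product_platforms", so its text includes everything from "Also On:" through "Switch". Try something like .find('a', href=re.compile("^.+switch.+$")) (which requires import re), or use a CSS selector instead: .select("a[href*=switch]"). Note that .select() returns a list, so take the first element before reading .text.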
(solution from here)
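As a rough sketch of the two suggestions above, the snippet below runs both against only the "Also On" fragment of the question's HTML. The variable names (link_by_regex, link_by_css, and so on) are made up for illustration and are not part of the original answer.

```python
import re

from bs4 import BeautifulSoup

# Only the "Also On" list item from the question's HTML.
html = '''
<li class="summary_detail product_platforms">
<span class="label">Also On:</span>
<span class="data">
<a href="/game/switch/into-the-breach" class="hover_none">Switch</a> </span>
</li>
'''

soup = BeautifulSoup(html, "html.parser")

# Option 1: find the <a> whose href matches a regular expression.
link_by_regex = soup.find("a", href=re.compile(r"^.+switch.+$"))
regex_text = link_by_regex.text.strip()

# Option 2: CSS attribute selector; [href*="switch"] means "href contains switch".
# select() returns a list, so take the first match.
link_by_css = soup.select('a[href*="switch"]')[0]
css_text = link_by_css.text.strip()

print(regex_text)
print(css_text)
```

Both approaches hard-code "switch" in the pattern, so they only help when you already know which platform you are looking for. For pages where the other platform varies, selecting by the surrounding class is probably the safer choice.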
Answer 1 (score: 0)
You can use BeautifulSoup's select() function to navigate to the "Switch" text. Here is a complete example:
from bs4 import BeautifulSoup
html = '''<div class="product_title">
<a class="hover_none" href="/game/pc/into-the-breach">
<h1>Into the Breach</h1>
</a>
<span class="platform">
<a href="/game/pc">
PC
</a>
</span>
</div>
<div class="product_data">
<ul class="summary_details">
<li class="summary_detail publisher">
<span class="label">Publisher:</span>
<span class="data">
<a href="/company/subset-games">
Subset Games
</a>
</span>
</li>
<li class="summary_detail release_data">
<span class="label">Release Date:</span>
<span class="data">Feb 27, 2018</span>
</li>
<li class="summary_detail product_platforms">
<span class="label">Also On:</span>
<span class="data">
<a class="hover_none" href="/game/switch/into-the-breach">Switch</a> </span>
</li>
</ul>
</div>'''
soup = BeautifulSoup(html, 'html.parser')
text = soup.select('.summary_detail.product_platforms .hover_none')[0].text.strip()
print(text)
Output:
Switch
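If a page lists more than one platform under "Also On", indexing with [0] would return only the first. The following is a minimal sketch of one way to collect all of them and build the combined "PC, Switch" label the question is after. It assumes every platform appears as an a element inside the span with class "data"; the helper name also_on_platforms is hypothetical.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the question's HTML: the main platform plus the "Also On" item.
html = '''
<span class="platform">
<a href="/game/pc">
PC
</a>
</span>
<ul class="summary_details">
<li class="summary_detail product_platforms">
<span class="label">Also On:</span>
<span class="data">
<a href="/game/switch/into-the-breach" class="hover_none">Switch</a> </span>
</li>
</ul>
'''


def also_on_platforms(soup):
    """Return the text of every link in the "Also On" item (hypothetical helper)."""
    links = soup.select("li.product_platforms span.data a")
    # get_text(strip=True) drops the surrounding whitespace and newlines.
    return [link.get_text(strip=True) for link in links]


soup = BeautifulSoup(html, "html.parser")

main_platform = soup.find("span", class_="platform").get_text(strip=True)
other_platforms = also_on_platforms(soup)

# Join everything into one string, e.g. for a label's setText() call.
label = ", ".join([main_platform] + other_platforms)
print(label)
```

Because the selector targets span.data rather than the whole li, the "Also On:" label text never ends up in the result.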