Python BeautifulSoup - How to extract text between <a> tags

Time: 2014-12-21 17:15:52

Tags: python html beautifulsoup

I want to extract the number "371" from the source below using BeautifulSoup4 in Python 3. I have tried many times but cannot get it to work. Can you help me? Thanks.

<a href="/ProviderRedirect.ashx?key=0.16198127.422314246.13.PLN.1277906077&amp;saving=551&amp;source=1-0" id="TotalLink" target="_blank"><span class="hc_pr_cur">PLN</span> 371<span class="hc_pr_syb"></span></a>

1 answer:

Answer 0 (score: 0)

Find the span tag with class hc_pr_cur and take its .next_sibling, which is the text node that follows it:

soup.find('span', class_='hc_pr_cur').next_sibling.strip()

Demo:

>>> from bs4 import BeautifulSoup
>>>
>>> data = '<a href="/ProviderRedirect.ashx?key=0.16198127.422314246.13.PLN.1277906077&amp;saving=551&amp;source=1-0" id="TotalLink" target="_blank"><span class="hc_pr_cur">PLN</span> 371<span class="hc_pr_syb"></span></a>'
>>>
>>> soup = BeautifulSoup(data, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning
>>> soup.find('span', class_='hc_pr_cur').next_sibling.strip()
'371'
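
The one-liner above raises AttributeError if the span is missing. As a hedged alternative, here is a minimal sketch that anchors on the link's id="TotalLink" from the question and keeps only the text nodes sitting directly inside the <a> tag. The helper name extract_direct_text and its link_id parameter are made up for illustration and are not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup, NavigableString

html = (
    '<a href="/ProviderRedirect.ashx?key=0.16198127.422314246.13.PLN.1277906077'
    '&amp;saving=551&amp;source=1-0" id="TotalLink" target="_blank">'
    '<span class="hc_pr_cur">PLN</span> 371<span class="hc_pr_syb"></span></a>'
)


def extract_direct_text(markup, link_id="TotalLink"):
    """Hypothetical helper: return the text directly inside the link, or None if the link is absent."""
    soup = BeautifulSoup(markup, "html.parser")
    link = soup.find("a", id=link_id)
    if link is None:
        return None
    # Keep only direct children that are text nodes; text inside the
    # nested <span> tags (such as "PLN") is skipped.
    pieces = [child for child in link.children if isinstance(child, NavigableString)]
    return "".join(pieces).strip()


price = extract_direct_text(html)
print(price)  # 371
```

The result is still a string; wrap it in int(...) if you need a number and are sure the text contains only digits.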