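I'm trying to extract the value inside a span, but that span has another span nested inside it. How can I get only the outer span's value instead of both?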
from bs4 import BeautifulSoup
some_price = page_soup.find("div", {"class":"price_FHDfG large_3aP7Z"})
some_price.span
# that code returns this:
'''
<span>$289<span class="rightEndPrice_6y_hS">99</span></span>
'''
# BUT I only want the $289 part, not the 99 associated with it
After making this adjustment:
some_price.span.text
the interpreter returns:
$28999
Is there some way to drop the trailing "99"? Or to extract only the first part of the span?
Any help or suggestions would be much appreciated!
Answer 0 (score: 0)
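You can get the value you want from the outer span's contents attribute, which lists the tag's direct children; the first child is the text node that precedes the nested span: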
from bs4 import BeautifulSoup as soup
html = '''
<span>$289<span class="rightEndPrice_6y_hS">99</span></span>
'''
result = soup(html, 'html.parser').find('span').contents[0]
Output:
'$289'
So, in the context of your original div lookup:
result = page_soup.find("div", {"class":"price_FHDfG large_3aP7Z"}).span.contents[0]
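Note that contents[0] relies on the text node being the first child of the span. As a rough alternative sketch, you could ask only for the span's direct text with find(string=True, recursive=False) and read the cents from the nested span separately. The wrapper div below is reconstructed from the class names in the question, and the "00" fallback and the conversion to float are assumptions for illustration, not something the original page is known to need:

```python
from bs4 import BeautifulSoup

# Assumed markup: the div wrapper is rebuilt from the class names in the question.
html = (
    '<div class="price_FHDfG large_3aP7Z">'
    '<span>$289<span class="rightEndPrice_6y_hS">99</span></span>'
    '</div>'
)
page_soup = BeautifulSoup(html, "html.parser")

outer = page_soup.find("div", {"class": "price_FHDfG large_3aP7Z"}).span

# Only text nodes that are direct children of the outer span are considered,
# so the "99" inside the nested span is skipped.
dollars = outer.find(string=True, recursive=False)

# Read the nested span on its own; fall back to "00" if it is missing (assumption).
cents_tag = outer.find("span", class_="rightEndPrice_6y_hS")
cents = cents_tag.get_text() if cents_tag else "00"

# Optionally combine both parts into a number.
price = float(dollars.strip().lstrip("$") + "." + cents)

print(dollars, cents, price)
```

This keeps the two parts separate, so you can use just the dollar amount or rebuild the full price with a decimal point, whichever you need.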