Hi everyone,
from bs4 import BeautifulSoup as b
data = """
<div class="hello1">
<span class="string1">This is string 1</span>
<span class="string2">This is string 2</span>
</div>
<div class="hello2">
<span class="string1">Another String 1</span>
</div>"""
bsObj = b(data, 'html.parser')
print(bsObj.find('span', 'string1'))
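I only want to get "Another String 1", but when I run the code the result is "This is string 1".
If I change find to findAll, it prints the string1 span from both div.hello1 and div.hello2, but I only want the span inside div.hello2.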
Answer 0 (score: 0)
You have to tell BS where you want to search for the span:
bsObj.find('div','hello2').find('span','string1')
#<span class="string1">Another String 1</span>
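One caveat with chaining find() calls: find() returns None when nothing matches, so the second call raises AttributeError if the div is missing. A minimal sketch of a guarded version follows, assuming the same sample markup as in the question; find_span_text is a hypothetical helper name used for illustration, not part of bs4.

```python
from bs4 import BeautifulSoup

data = """
<div class="hello1">
<span class="string1">This is string 1</span>
<span class="string2">This is string 2</span>
</div>
<div class="hello2">
<span class="string1">Another String 1</span>
</div>"""


def find_span_text(soup, div_class, span_class):
    """Hypothetical helper: text of the span inside the given div, or None."""
    # First narrow the search to the div with the requested class.
    div = soup.find('div', class_=div_class)
    if div is None:
        return None
    # Then search for the span only inside that div.
    span = div.find('span', class_=span_class)
    return span.get_text() if span is not None else None


soup = BeautifulSoup(data, 'html.parser')
print(find_span_text(soup, 'hello2', 'string1'))  # Another String 1
print(find_span_text(soup, 'hello3', 'string1'))  # None (no such div)
```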
Answer 1 (score: 0)
You can use CSS selectors with the select() / select_one() methods to target tags. The selector div.hello2 span matches <span> tags that are descendants of a <div> tag with class hello2:
from bs4 import BeautifulSoup as b
data = """
<div class="hello1">
<span class="string1">This is string 1</span>
<span class="string2">This is string 2</span>
</div>
<div class="hello2">
<span class="string1">Another String 1</span>
</div>"""
bsObj = b(data, 'html.parser')
print(bsObj.select_one('div.hello2 span').text)
Prints:
Another String 1
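If div.hello2 could contain several spans, the selector can be narrowed by also naming the span class. The sketch below assumes the same sample markup; it also shows that select() returns a list of every match, while select_one() returns only the first match (or None).

```python
from bs4 import BeautifulSoup

data = """
<div class="hello1">
<span class="string1">This is string 1</span>
<span class="string2">This is string 2</span>
</div>
<div class="hello2">
<span class="string1">Another String 1</span>
</div>"""

soup = BeautifulSoup(data, 'html.parser')

# Match only span.string1 elements nested inside div.hello2.
span = soup.select_one('div.hello2 span.string1')
print(span.text)  # Another String 1

# select() collects every matching tag, here both spans in div.hello1.
texts = [s.get_text() for s in soup.select('div.hello1 span')]
print(texts)  # ['This is string 1', 'This is string 2']
```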