I have HTML in the following format:
<div class="consider">
<div class="row">
<p>Text1</p>
</div>
</div>
<div class="consider">
<h2>Hello</h2>
</div>
<div class="Consider">
<div class="row">
<p>Text2</p>
</div>
</div>
I only want to get a div tag if its child tag (also a div) has the class "row".
Answer 0 (score: 0)
Here is how you can access it:
from bs4 import BeautifulSoup

content = '<div class="consider"><div class="row"><p>Text1</p></div></div><div class="consider"><h2>Hello</h2></div><div class="Consider"><div class="row"><p>Text2</p></div></div>'
soup = BeautifulSoup(content, 'lxml')

# Find every <div class="row">, then step up to its parent.
for div in soup.find_all('div', class_='row'):
    if div.parent.name == "div":
        # div.parent is the element you want; do whatever you need with it.
        print(div.parent)
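The same check can also be written from the parent's side, by passing a filter function to find_all so that the outer div elements are returned directly. This is only a sketch: the helper name has_row_child is made up for illustration, and html.parser is used here instead of lxml purely to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

content = (
    '<div class="consider"><div class="row"><p>Text1</p></div></div>'
    '<div class="consider"><h2>Hello</h2></div>'
    '<div class="Consider"><div class="row"><p>Text2</p></div></div>'
)


def has_row_child(tag):
    """Hypothetical helper: True for a <div> with a direct child <div class="row">."""
    if tag.name != "div":
        return False
    # recursive=False restricts the search to direct children only.
    return tag.find("div", class_="row", recursive=False) is not None


soup = BeautifulSoup(content, "html.parser")
parents = soup.find_all(has_row_child)

for parent in parents:
    print(parent.get("class"))
```

Because the filter runs on every tag, the inner row divs are rejected (their only child is a p), and the div containing the h2 is rejected as well.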
Answer 1 (score: 0)
Using select('div > div.row'), we select every div tag with class row that is a direct child of a div tag, then use a list comprehension to collect the parents of those tags:
data = '<div class="consider"><div class="row"><p>Text1</p></div></div><div class="consider"><h2>Hello</h2></div><div class="Consider"><div class="row"><p>Text2</p></div></div>'
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'lxml')
divs = [div.parent for div in soup.select('div > div.row')]
print(divs)
Output:
[<div class="consider"><div class="row"><p>Text1</p></div></div>, <div class="Consider"><div class="row"><p>Text2</p></div></div>]
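If your Beautiful Soup version uses the soupsieve selector backend (4.7 or later, as far as I know), the parent lookup can probably be folded into the selector itself with the :has() pseudo-class. A minimal sketch under that assumption, using html.parser rather than lxml only to keep the example dependency-free:

```python
from bs4 import BeautifulSoup

data = (
    '<div class="consider"><div class="row"><p>Text1</p></div></div>'
    '<div class="consider"><h2>Hello</h2></div>'
    '<div class="Consider"><div class="row"><p>Text2</p></div></div>'
)

soup = BeautifulSoup(data, "html.parser")

# "div:has(> div.row)" matches a <div> that has a direct child <div class="row">,
# so the outer elements come back without going through .parent.
matches = soup.select("div:has(> div.row)")

print([m["class"] for m in matches])
```

This selects the parents directly, so no list comprehension over .parent is needed.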