<p>
<strong>
<em>
Insurtech
</em>
</strong>
</p>
<p> .....Some data </p>
<p>
<strong>
<em>
Biometrics
</em>
</strong>
</p>
I tried this:
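html_tags = soup.find_all('em')
# soup_str is assumed to hold the same HTML as a plain string, e.g. str(soup)
for i in range(len(html_tags) - 1):
    start_tag = html_tags[i]
    end_tag = html_tags[i + 1]
    # take the text that sits between two consecutive <em> tags
    between_tag = soup_str.split(str(start_tag))[1].split(str(end_tag))[0]
    soup1 = BeautifulSoup(between_tag, 'html.parser')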
I want all the data from one p->strong->em tag up to the next p->strong->em tag. The HTML above is my sample data. Thanks in advance.
Answer 0 (score: 2)
s = '''<p>
<strong>
<em>
Insurtech
</em>
</strong>
</p>
<p> .....Some data </p>
<p>
<strong>
<em>
Biometrics
</em>
</strong>
</p>'''
from bs4 import BeautifulSoup

soup = BeautifulSoup(s, 'html.parser')  # parse the string s defined above
print(list(soup.stripped_strings))
# ['Insurtech', '.....Some data', 'Biometrics']
Answer 1 (score: 0)
You can use the .text attribute of each tag to get the information you need.
Example:
s = """<p>
<strong>
<em>
Insurtech
</em>
</strong>
</p>
<p> .....Some data </p>
<p>
<strong>
<em>
Biometrics
</em>
</strong>
</p>"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(s, "html.parser")
html_tags = soup.find_all('p')
for h in html_tags:
    print(h.text.strip())  # strip the surrounding whitespace and newlines
Output:
Insurtech
.....Some data
Biometrics
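
Both answers return the text as one flat sequence, while the question asks for everything between one p->strong->em paragraph and the next. A minimal sketch of that grouping follows, assuming each "heading" is a p that contains strong > em and that everything after it, up to the next heading, belongs to it. The helper name is_heading and the sections dict are illustrative only and do not come from the original posts. The sample HTML is the same data written on fewer lines.

```python
from bs4 import BeautifulSoup

s = """<p><strong><em>Insurtech</em></strong></p>
<p> .....Some data </p>
<p><strong><em>Biometrics</em></strong></p>"""

soup = BeautifulSoup(s, "html.parser")


def is_heading(p):
    # Assumption: a heading paragraph is a <p> that contains <strong><em>.
    return p.select_one("strong > em") is not None


sections = {}   # heading text -> list of paragraph texts that follow it
current = None  # heading we are currently collecting under

for p in soup.find_all("p"):
    if is_heading(p):
        current = p.get_text(strip=True)
        sections[current] = []
    elif current is not None:
        sections[current].append(p.get_text(strip=True))

print(sections)
# {'Insurtech': ['.....Some data'], 'Biometrics': []}
```

Walking the parsed p tags avoids splitting the raw HTML string on str(tag), which can break when the serialized tag text does not match the original markup exactly.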