Let's say we have:
<div>
<p>this is some text</p>
<p>...and this is some other text</p>
</div>
How can I retrieve the text of the second paragraph with BeautifulSoup?
Answer 0 (score: 13)
You can do this with a CSS selector:
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("""<div>
... <p>this is some text</p>
... <p>...and this is some other text</p>
... </div>""", "html.parser")
>>> soup.select('div > p')[1].get_text(strip=True)
'...and this is some other text'
Answer 1 (score: 9)
You can use nth-of-type:
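from bs4 import BeautifulSoup

h = """<div>
<p>this is some text</p>
<p>...and this is some other text</p>
</div>"""
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(h, "html.parser")
print(soup.select_one("div p:nth-of-type(2)").text)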
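To illustrate why nth-of-type is a reasonable choice here, the sketch below adds a hypothetical <span> between the two paragraphs (it is not part of the original question). nth-of-type counts only sibling elements with the same tag name, so it should still find the second <p>, whereas nth-child counts every sibling element and finds nothing.

```python
from bs4 import BeautifulSoup

# Assumption for illustration: an extra <span> sits between the two paragraphs.
html = """<div>
<p>this is some text</p>
<span>not a paragraph</span>
<p>...and this is some other text</p>
</div>"""

soup = BeautifulSoup(html, "html.parser")

# nth-of-type(2): the second <p> among the <p> siblings, ignoring the <span>.
by_type = soup.select_one("div > p:nth-of-type(2)")
by_type_text = by_type.get_text(strip=True) if by_type is not None else None

# nth-child(2): the second child element is the <span>, so no <p> matches.
by_child = soup.select_one("div > p:nth-child(2)")

print(by_type_text)
print(by_child)
```

select_one returns None when nothing matches, so checking for None before reading the text avoids an AttributeError on pages that have fewer paragraphs than expected.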
Answer 2 (score: 2)
In : secondp = soup.find('div').find_all('p')  # every <p> inside the first <div>, in document order
In : secondp[1].text
Out: '...and this is some other text'
Or you can use findChildren directly:
div_ = soup.find('div').findChildren()
for i, child in enumerate(div_):
    # Index 1 is the second element found inside the <div>
    if i == 1:
        print(child.text)
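The approaches in this answer can be wrapped in a small helper, sketched below. The function name nth_paragraph_text is hypothetical, and the sketch assumes that only direct <p> children of the first <div> should count, which is why it passes recursive=False to find_all. It returns None instead of raising when the <div> or the requested paragraph is missing.

```python
from bs4 import BeautifulSoup


def nth_paragraph_text(html, n):
    """Return the text of the n-th (0-based) direct <p> child of the first <div>.

    Hypothetical helper for illustration; returns None if there is no <div>
    or fewer than n + 1 paragraphs.
    """
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div")
    if div is None:
        return None
    # recursive=False limits the search to direct children of the <div>.
    paragraphs = div.find_all("p", recursive=False)
    if n < 0 or n >= len(paragraphs):
        return None
    return paragraphs[n].get_text(strip=True)


html = """<div>
<p>this is some text</p>
<p>...and this is some other text</p>
</div>"""

result = nth_paragraph_text(html, 1)
print(result)
```

Note that findChildren() with no arguments searches all descendants, so on more deeply nested markup it may not give the same element as a direct-children search.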
Answer 3 (score: 0)
You can solve this with gazpacho:
from gazpacho import Soup
html = """\
<div>
<p>this is some text</p>
<p>...and this is some other text</p>
</div>
"""
soup = Soup(html)
soup.find('p')[1].text
Which will output:
'...and this is some other text'