How to get the href from an <a> when searching for text in Beautiful Soup

Time: 2019-02-24 00:22:40

Tags: python beautifulsoup

I'm working with Selenium and BeautifulSoup to perform a data extract.

This page is paginated. I know that this link exists somewhere on the page:

<a href="/DP/changeQueryPageAction.do?pager.offset=20">[ Next &gt; ]</a>

This url is in a random location on the page, so what I need to do is find the text and extract the href.

How do I ask bs4 to find the text, and give me the href?

Thanks

1 answer:

Answer 0: (score: 3)

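To find an element by its text, one option is to pass a compiled regular expression from the re module as the text argument of find() (named string in Beautiful Soup 4.4.0 and later; text is the older name).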

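import re
import bs4

html_doc = """<html><a href="/DP/changeQueryPageAction.do?pager.offset=20">[ Next &gt; ]</a></html>"""
soup = bs4.BeautifulSoup(html_doc, 'html.parser')
# string= is the current name of the older text= argument (Beautiful Soup 4.4.0+).
next_link = soup.find('a', string=re.compile("Next"))
if next_link is not None:  # find() returns None when no <a> matches
    print(next_link['href'])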

Output:

/DP/changeQueryPageAction.do?pager.offset=20
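As a possible variation, here is a minimal sketch that avoids the regular expression and also handles link text wrapped in child tags (a text/string filter only matches when the <a> has a single string child). The function name find_next_href, the label parameter, and the example.com base URL are all hypothetical, chosen only for illustration; urljoin is used to turn the relative href into an absolute URL.

```python
from urllib.parse import urljoin

import bs4


def find_next_href(html, base_url, label="Next"):
    """Return the absolute URL of the first <a> whose visible text contains label, or None."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    # href=True skips anchors that have no href attribute at all.
    for link in soup.find_all("a", href=True):
        # get_text() joins the text of all descendants, so nested tags are fine.
        if label in link.get_text():
            return urljoin(base_url, link["href"])
    return None


# Hypothetical base URL; only the href comes from the question.
html_doc = '<html><a href="/DP/changeQueryPageAction.do?pager.offset=20">[ Next &gt; ]</a></html>'
print(find_next_href(html_doc, "https://example.com/DP/search.do"))
```

With Selenium, html would typically be driver.page_source and base_url would be driver.current_url.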

Let me know if this works for you.