I'm working with Selenium and BeautifulSoup to perform a data extract.
This page is paginated. I know that this link exists somewhere on the page:
<a href="/DP/changeQueryPageAction.do?pager.offset=20">[ Next > ]</a>
The link can appear anywhere on the page, so I need to find it by its text and extract the href.
How do I ask bs4 to find the text and give me the href?
Thanks
Answer 0 (score: 3)
To find an element by a partial match on its text, you can use the re module and pass a compiled pattern to find().
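import re

import bs4

# Sample page containing the link (closing tags nested in the correct order)
html_doc = """<html><a href="/DP/changeQueryPageAction.do?pager.offset=20">[ Next > ]</a></html>"""
soup = bs4.BeautifulSoup(html_doc, 'html.parser')
# Match the first <a> whose text contains "Next"; string= is the current name for the older text= argument
next_link = soup.find('a', string=re.compile("Next"))
print(next_link['href'])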
Output:
/DP/changeQueryPageAction.do?pager.offset=20
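Since the href is relative, following it usually means joining it with the address of the current page. Here is a minimal sketch of that step, which also handles a page that has no Next link. The helper name find_next_href and the http://example.com base URL are assumptions made up for illustration. With Selenium, the HTML would typically come from driver.page_source and the base URL from driver.current_url.

```python
import re
from urllib.parse import urljoin

import bs4


def find_next_href(html, base_url):
    """Return the absolute URL of the first <a> whose text contains 'Next', or None."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    # href=True skips anchors that have no href attribute
    link = soup.find("a", href=True, string=re.compile("Next"))
    if link is None:
        return None  # no matching link, e.g. on the last page
    # Resolve the relative href against the page it was found on
    return urljoin(base_url, link["href"])


# Hypothetical page and base URL, for illustration only
page = '<html><body><a href="/DP/changeQueryPageAction.do?pager.offset=20">[ Next > ]</a></body></html>'
print(find_next_href(page, "http://example.com/DP/list.do"))
```

Returning None when nothing matches lets a pagination loop stop cleanly instead of raising a TypeError on the missing element.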
Let me know if this works for you.