转储网页时搜索字符串

时间:2018-03-31 15:01:27

标签: python html python-3.x python-requests

我正在转发包含请求的网页,我需要在代码中搜索一个字符串。我需要获取div字段并搜索某个值。

import requests, pprint
page = requests.get('')
tree = (page.content)
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(tree)

1 个答案:

答案 0 :(得分:0)

你可以使用BeautifulSoup来做这些事情 - 它是一个HTML解析器。

请参阅:Beautiful Soup and extracting a div and its contents by ID

导入请求,pprint import bs4

page = requests.get('')
tree = (page.content)
soup = bs4.BeautifulSoup('<html><body><div id="articlebody"> ... </div></body></html')
divs = soup.find_all("div")
texts = [i.text for i in soup.find_all("div")]