Web scraping a value from a strong tag

Time: 2021-04-11 11:06:42

Tags: python html beautifulsoup tags

How can I extract data from a strong tag?

HTML code:
<div class="store-views"><span class="caption">VISIT COUNT</span><br><strong>336</strong></div>

I tried soup.find("strong") and soup.find("div", class_="store-views"), but each one returns either the wrong data or None.

1 answer:

Answer 0: (score: 0)

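The value is added dynamically after the page loads, possibly from Google Analytics, so it is not in the HTML that a plain request returns. You can use selenium to automate a browser and read the value once it has been added: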

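from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

d = webdriver.Chrome()
d.get('https://store.bricklink.com/legoseller9997&utm_content=globalnav#/shop')
# Wait (up to 10 s, an arbitrary timeout) for the dynamically added element,
# using the Selenium 4 locator API instead of find_element_by_css_selector.
el = WebDriverWait(d, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '.store-views strong')))
print(el.text)
d.quit()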
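
To illustrate why the BeautifulSoup calls in the question are not wrong in themselves, here is a minimal sketch that parses the static snippet from the question. It assumes the markup is already present in the string being parsed; the helper name extract_visit_count is hypothetical and only used for illustration.

```python
from bs4 import BeautifulSoup

# Static copy of the markup from the question, as it appears once rendered.
html = ('<div class="store-views"><span class="caption">VISIT COUNT</span>'
        '<br><strong>336</strong></div>')


def extract_visit_count(markup):
    """Return the text of the <strong> inside .store-views, or None if absent."""
    soup = BeautifulSoup(markup, "html.parser")
    node = soup.select_one(".store-views strong")
    return node.get_text(strip=True) if node else None


print(extract_visit_count(html))           # 336
print(extract_visit_count("<div></div>"))  # None
```

The second call mirrors what likely happens when the page is fetched without a browser: the element is not in the response yet, so the lookup returns None. The same helper could presumably be applied to d.page_source from a Selenium session once the value has appeared.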

The other data comes from an ajax request:

import requests

url = 'https://store.bricklink.com/ajax/clone/store/searchitems.ajax?showHomeItems=1&sid=1663355'
# A browser-like User-Agent header is sent with the request; the response body is JSON.
r = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}).json()
print(r)