Question

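I am using Python requests to scrape a website. I set the headers exactly as the browser sends them, and I enabled cookies by using a requests.Session() object. The status code is 200, but the site tries to run a JS challenge. This is how I get my bs4 soup result: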

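import requests
from bs4 import BeautifulSoup
from user_agent import generate_user_agent  # presumably the third-party "user_agent" package

# Request headers copied from the browser
headers = {'User-Agent': generate_user_agent(os=('mac', 'linux')),
           'accept-language': 'it-IT,it;q=0.9,en-US;q=0.8,en;q=0.7',
           'cache-control': 'no-cache',
           'pragma': 'no-cache',
           'upgrade-insecure-requests': '1',
           'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3'}

# The Session object keeps cookies between requests
session = requests.Session()
r = session.get(url=link, headers=headers)  # link: URL of the page being scraped
soup = BeautifulSoup(r.text, 'lxml')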

I want to execute or emulate this JS script. (I tried Selenium, but it is slow, and requests-html does not work properly.)

Thanks!

Here is the JS script (the output of print(soup)): https://pastebin.com/yRxbUxkm
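
Because the status code is 200 either way, it can help to tell the challenge page apart from real content before parsing. The following is only a rough sketch of one possible heuristic, not a way to solve the challenge: it assumes the challenge page consists mostly of script with very little visible text. The names looks_like_js_challenge and min_text_chars, and the threshold of 200 characters, are hypothetical and would need tuning for the actual site.

```python
from html.parser import HTMLParser


class _TextVsScript(HTMLParser):
    """Tally characters inside <script> tags versus visible text."""

    def __init__(self):
        super().__init__()
        self._inside = None      # "script" or "style" while inside one
        self.script_chars = 0
        self.text_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._inside = tag

    def handle_endtag(self, tag):
        if tag == self._inside:
            self._inside = None

    def handle_data(self, data):
        size = len(data.strip())
        if self._inside == "script":
            self.script_chars += size
        elif self._inside is None:   # ignore CSS inside <style>
            self.text_chars += size


def looks_like_js_challenge(html, min_text_chars=200):
    """Heuristic (assumption): scripts present, almost no visible text."""
    parser = _TextVsScript()
    parser.feed(html)
    parser.close()
    return parser.script_chars > 0 and parser.text_chars < min_text_chars
```

With the code above, a check such as `if looks_like_js_challenge(r.text): ...` could decide whether to fall back to a slower browser-based fetch only for the pages that need it.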

Unable to execute or emulate a JS script (challenge) from a website

0 answers: