python - Trying to scrape a website using an API call - Thinbug

Trying to scrape a website using an API call

Time: 2020-09-28 18:05:52

Tags: python api web-scraping beautifulsoup

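I am trying to scrape the website https://www.jobijoba.com/fr/query/?what=&where=Ile-de-france&where_type=region using an API call in Python. I am sending the request with the requests library, but unfortunately I cannot access the data. My code is shared below. How can I scrape this site effectively? Should I use Selenium WebDriver for this task? Any help would be greatly appreciated.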

import requests
headers = {
    'Accept': '*/*',
    'Referer': 'https://www.jobijoba.com/fr/query/?what=&where=Ile-de-france&where_type=region',
    'X-Requested-With': 'XMLHttpRequest',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36',
    'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
}

# Form fields sent in the POST body
data = {
    'where': 'Ile-de-france',
    'where_type': 'region',
    'perimeter': '20',
    'duration': '',
    'period': '',
    'publication': '',
    'contract': '',
    'formation': 'false',
    'jobbing': 'false',
    'page': '4',
    'editor_id': '54'
}

response = requests.post('https://www.jobijoba.com/fr/url_api', headers=headers, data=data)

1 answer:

Answer 0 (score: 0)

Your script does get a response, which you can access with:

response.json()

I don't have the context to say what the response means, but it is definitely a valid API response.
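Since the structure of the payload is not described anywhere in this thread, one cautious way to explore it is to decode the body and look only at its top-level shape. The sketch below is a minimal illustration, not the site's documented API: the helper names `decode_api_response` and `summarize` are hypothetical, and the code assumes only that the object passed in behaves like a requests response (it has `raise_for_status()` and `json()`).

```python
def decode_api_response(response):
    """Return the decoded JSON body of a requests-style response, or None.

    Hypothetical helper: it makes no assumption about the API's schema.
    """
    # Fail early on HTTP errors (4xx/5xx) instead of parsing an error page.
    response.raise_for_status()
    try:
        return response.json()
    except ValueError:
        # The JSON decoding errors raised by requests are ValueError
        # subclasses, so a non-JSON body (e.g. an HTML page) ends up here.
        return None


def summarize(payload):
    """Describe the top-level shape of a decoded payload."""
    if isinstance(payload, dict):
        return {"type": "dict", "keys": sorted(payload)}
    if isinstance(payload, list):
        return {"type": "list", "length": len(payload)}
    return {"type": type(payload).__name__}
```

With the `response` object from the question's script, `summarize(decode_api_response(response))` would list the top-level keys (or the list length) of whatever the endpoint returned, which is a reasonable starting point for finding the job listings without reaching for Selenium.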