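The link below contains the data I need to scrape: https://jobsearch.svc.dhigroupinc.com/v1/efc/jobs/search?page=1&facets=*&countryCode2=SG&pageSize=10&currencyCode=SGD
In the browser's preview I can see that the data is available, but it is hidden when I open the link directly (see the linked screenshot, "Preview of data").
Opening the link only shows: {"message":"Forbidden"}
Is there any way I can retrieve the JSON data I need, like the following?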
{"data":[{"id":"307ocL4mnUnNJT5V","title":"KYC Analyst","jobLocation":{"city":"Singapore",...........
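If needed, here is the data from the network headers.
I have already used selenium to retrieve the data I want, but if I can retrieve the JSON data directly, I can skip selenium and just use a simple request. Any ideas?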
Answer 0 (score: 1)
The only thing you seem to be missing is the API key. I'm not sure how often (if at all) it changes, but I am able to make the call successfully simply by adding the x-api-key header to the request.
import json

import requests

base_url = 'https://jobsearch.svc.dhigroupinc.com/v1/efc/jobs/search'
params = {
    'page': 1,
    'facets': '*',
    'countryCode2': 'SG',
    'pageSize': 10,
    'currencyCode': 'SGD',
}
headers = {
    'x-api-key': 'zvDFWwKGZ07cpXWV37lpO5MTEzXbHgyL4rKXb39C'
}

r = requests.get(base_url, headers=headers, params=params)
r.raise_for_status()

# json.dumps only for pretty printing, r.json() is all you need
print(json.dumps(r.json(), indent=2))
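Since the key may change, it can help to keep it out of the code and to turn a 403 into a clearer error. The following is only a sketch under assumptions: the function names `build_search_request` and `fetch_page` and the environment variable `EFC_API_KEY` are made up for illustration, and treating a 403 as "key missing or changed" is a guess based on the {"message":"Forbidden"} response in the question, not documented behavior of this API.

```python
import requests

BASE_URL = "https://jobsearch.svc.dhigroupinc.com/v1/efc/jobs/search"


def build_search_request(api_key, page=1, page_size=10,
                         country="SG", currency="SGD"):
    """Prepare (but do not send) a search request with the x-api-key header.

    Hypothetical helper: the parameter names mirror the query string
    from the question, nothing more.
    """
    params = {
        "page": page,
        "facets": "*",
        "countryCode2": country,
        "pageSize": page_size,
        "currencyCode": currency,
    }
    request = requests.Request(
        "GET", BASE_URL, params=params, headers={"x-api-key": api_key}
    )
    return request.prepare()


def fetch_page(session, api_key, page=1):
    """Send one prepared search request and return the decoded JSON."""
    response = session.send(build_search_request(api_key, page=page), timeout=10)
    if response.status_code == 403:
        # Assumption: a 403 here means the key is missing or no longer valid.
        raise PermissionError("403 Forbidden: check the x-api-key value")
    response.raise_for_status()
    return response.json()


# Example usage (performs a real network call, so it is left commented out):
# import os
# with requests.Session() as session:
#     jobs = fetch_page(session, os.environ["EFC_API_KEY"], page=1)
```

Building the request separately from sending it means the URL and headers can be inspected without touching the network, which is handy for comparing them against what the browser's network tab shows.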
Output: