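I'm trying to scrape a table from this website:
https://pipeline.thedeal.com/search/AdvancedDealSearch.dl?preset=bankruptcyfilings#results
The only way to get the full table is to click "Show more" at the bottom of the table repeatedly. Rather than do that for thousands of entries, I used Chrome's dev tools to capture the POST request that is sent each time the "Show more" button is pressed.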
Chrome DevTools shows the following request payload:
{"newSearch":true,"searchParams":"{\"dealTypes\":[\"Bankruptcy Filing\"],\"sortBy\":\"announcedDate\",\"sortDirection\":\"desc\"}","page":2}
import requests
import json
url = 'https://pipeline.thedeal.com/search/AdvancedDealSearch.dl'
payload = {"newSearch":'true',"searchParams":"{\"dealTypes\":[\"Bankruptcy Filing\"],\"sortBy\":\"announceDate\",\"sortDirection\":\"desc\"}","page":'2'}
r = requests.post(url, data = json.dumps(payload))
print(r.text)
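However, when I try the Python code above, I keep getting a 415 error. I suspect it has something to do with how the request is formatted. Any ideas what could be going wrong? Thanks.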
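One guess at the cause: 415 is "Unsupported Media Type", and the code above sends a JSON string as the body without declaring it as JSON. Below is a minimal sketch of how the request could be built so the body matches the dev tools payload and carries a `Content-Type: application/json` header. It assumes that header is what the server wants, which I have not confirmed, and it says nothing about cookies or login the site may also require. `build_request` is a hypothetical helper name. The sketch uses the standard library and only constructs the request without sending it; with `requests`, passing `json=payload` instead of `data=json.dumps(payload)` should set the same header.

```python
import json
import urllib.request

URL = "https://pipeline.thedeal.com/search/AdvancedDealSearch.dl"


def build_request(page):
    """Build (but do not send) the POST request for one page of results."""
    # searchParams is itself a JSON-encoded string nested inside the JSON body,
    # mirroring the payload shown in dev tools ("announcedDate" as seen there).
    search_params = json.dumps(
        {
            "dealTypes": ["Bankruptcy Filing"],
            "sortBy": "announcedDate",
            "sortDirection": "desc",
        },
        separators=(",", ":"),
    )
    # newSearch and page are a real boolean and integer, not the strings
    # 'true' and '2', so they serialize the same way as in the captured payload.
    body = {"newSearch": True, "searchParams": search_params, "page": page}
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the server rejects bodies that are not declared as JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request(2)
print(req.get_method(), req.get_header("Content-type"))
print(req.data.decode("utf-8"))
```

If the headers and body printed here look like what dev tools shows, the request could then be sent with `urllib.request.urlopen(req)`.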