Extracting JSON in R from a web page whose data is hidden behind an API

Time: 2018-05-04 01:26:31

Tags: r json

I'll be direct. I want to get the data shown on this website: Liberty Shares

So far I have tried:

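  • Regular parsing with read_html and going into the nodes with the rvest library. This loads the page header, but not the actual table on the page.
  • Capturing the page with phantomjs. This gets me everything I want except the table on the page.
  • curl. I got the curl command by opening the Network tab in the browser and looking only at the XHR items. The JSON containing the data is inside the second searchentity item that loads. After I cleaned up the curl command, it failed to produce an actual JSON file and reported that the request was malformed.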

curl "https://api.fundpress.io/fund/searchentity" \
  -H "x-ksys-token: 92e05e6b-acca-4238-86a1-4f6dc857f319" \
  -H "accept-language: en-US,en;q=0.9" \
  -H "user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36" \
  -H "content-type: application/json;charset=UTF-8" \
  -H "accept: application/json, text/plain, */*" \
  -H "referer: https://www.libertyshares.com/fund-explorer/" \
  -H "authority: api.fundpress.io" \
  -H "accept-encoding: gzip, deflate, br" \
  --data-binary '{"type":"CLSS","clientCode":[""],"fundList":"us","search":[{"property":"status","values":["Active","Smart Beta","Passive"],"matchtype":"LIKE"}],"include":{"statistics":{}},"limit":"","start":0,"translate":true,"sourceCulture":"en-US","culture":"en-US"}' \
  --compressed > liberty.html
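
A note on quoting, since it may bear on the malformed-request error: in a POSIX shell, a JSON body wrapped in double quotes loses its inner double quotes unless each one is escaped, and a value containing a space (such as "Smart Beta") would additionally split the argument in two. The sketch below uses a shortened, hypothetical payload rather than the real request body and makes no network call; it only prints what the shell would hand to curl under each quoting style. It assumes sh or bash, not Windows cmd, where the quoting rules differ.

```shell
#!/bin/sh
# Hypothetical, shortened payload used only to illustrate shell quoting.

# Double-quoted body with unescaped inner quotes: every inner " closes or
# reopens the shell string, so none of them survive.
bad="{"type":"CLSS","start":0}"

# Single-quoted body: the inner double quotes are kept literally.
good='{"type":"CLSS","start":0}'

printf 'bad:  %s\n' "$bad"    # prints: bad:  {type:CLSS,start:0}
printf 'good: %s\n' "$good"   # prints: good: {"type":"CLSS","start":0}
```

Only the second form is valid JSON, so it is the one to pass to --data-binary.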

The only other option I can think of is Selenium, but I would like to avoid it if at all possible.

0 answers:

No answers