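I am scraping an applicant tracking system (BrassRing). I can log in with Selenium without any trouble and reach the page I am interested in. While looking for the table, I found that the data I need is stored in a jsonGrid.
Everything I have been able to find about Selenium and scraping does not cover how to scrape the contents of a JSON grid.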
The grid/table has 8 columns, with all of the data beneath them (some cells are empty, which is fine).
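As far as I can tell, although the website displays the columns slightly differently, the columns in the JSON itself are labeled as follows: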
Action Type
Action Date
Action By
Details
Name
emailfrom
emailto
folderid
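Here is the first part of the page source, showing the headers and some of the column values.
Any information on how to scrape a JSON grid / JSON data from a website would be greatly appreciated.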
<input type="hidden" name="Grid$jsonData183" id="Grid_jsonData183" class="jsonGridData" value="[{"ActionType": "Communication - Email","ActionDate": "18-Oct-2019 14:14:25","ActionBy": "Manager, Automation ()","Details": "Status: Sent as to","Name": "<a href=\"#\"/ class=\"ViewCommunication\">Not Selected</a>","emailfrom": "Manager, Automation ()","emailto": "Smith, John(john.smith@notreal.com)","hm_category": "5","hm_Folderid": "6537489","hm_ReqId": "-1","hm_content": "1","hm_md_communication_type": "Communication - Email","hm_md_communication_correspondenceid": "1","hm_md_communication_correspondenceresumeid": "46878397","hm_pushportal": "0","hm_unpostportal": "0","hm_postportall": "0","hm_PortalExpired": "0","hm_md_communication_agencycodetypeid": "0","hm_md_communication_agencycodeid": "0","hm_md_communication_userid": "41","hm_md_RecipientType": "4","hm_EmailLogId": "0","hm_md_ReceiverUserID": "0","hm_md_fid": "6537489","hm_md_rid": "6454343","hm_md_rftid": "17","hm_md_rsts": "0","hm_md_myfolder": "0","foldername": "<a href='#' class='ViewFolder'>1738995BR:Customer Service Associate II</a>","hm_md_afl": "0","hm_md_rfl": "1","hm_md_rlg": "en", "rowmetadata": "<div><div name=\"category\" value=\"5\"></div><div name=\"folderid\" value=\"6537489\"></div><div name=\"reqid\" value=\"-1\"></div><div name=\"content\" value=\"1\"></div><div name=\"md_communication_type\" value=\"Communication+-+Email\"></div><div name=\"md_communication_correspondenceid\" value=\"1\"></div><div name=\"md_communication_correspondenceresumeid\" value=\"46878397\"></div><div name=\"pushportal\" value=\"0\"></div><div name=\"unpostportal\" value=\"0\"></div><div name=\"postportall\" value=\"0\"></div><div name=\"portalexpired\" value=\"0\"></div><div name=\"md_communication_agencycodetypeid\" value=\"0\"></div><div name=\"md_communication_agencycodeid\" value=\"0\"></div><div name=\"md_communication_userid\" value=\"41\"></div><div name=\"md_recipienttype\" value=\"4\"></div><div 
name=\"emaillogid\" value=\"0\"></div><div name=\"md_receiveruserid\" value=\"0\"></div><div name=\"md_fid\" value=\"6537489\"></div><div name=\"md_rid\" value=\"6454343\"></div><div name=\"md_rftid\" value=\"17\"></div><div name=\"md_rsts\" value=\"0\"></div><div name=\"md_myfolder\" value=\"0\"></div><div name=\"md_afl\" value=\"0\"></div><div name=\"md_rfl\" value=\"1\"></div><div name=\"md_rlg\" value=\"en\"></div></div>"},{"ActionType": "Communication - Email","ActionDate": "18-Oct-2019 13:24:13","ActionBy": "Manager, Automation ()","Details": "Status: Sent as to","Name": "<a href=\"#\"/ class=\"ViewCommunication\">Not Selected</a>","emailfrom": "Manager, Automation ()","emailto": "Smith, John(john.smith@notreal.com)","hm_category": "5","hm_Folderid": "6513663","hm_ReqId": "-1","hm_content": "1","hm_md_communication_type": "Communication - Email","hm_md_communication_correspondenceid":
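For reference, one possible approach is sketched below. It relies on the fact that the grid data sits in the `value` attribute of a hidden `<input>`, so it can be read as a string and handed to `json.loads` rather than scraped cell by cell. This is only a sketch under several assumptions: the `input.jsonGridData` selector is taken from the `class` in the snippet above (the numeric suffix in the `id` may differ per page), the key names in `WANTED` are read off the JSON shown (with `hm_Folderid` assumed to be the "folderid" column), and `strip_tags`, `parse_grid`, and `read_grid` are hypothetical helper names, not part of Selenium or BrassRing.

```python
import json
import re
from html import unescape

# Keys as they appear in the JSON above; "hm_Folderid" is an assumed
# match for the "folderid" column listed in the question.
WANTED = ["ActionType", "ActionDate", "ActionBy", "Details",
          "Name", "emailfrom", "emailto", "hm_Folderid"]

# Crude tag stripper: good enough for the small <a ...>text</a> fragments
# seen in the "Name" field, not a general HTML parser.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(text):
    """Remove HTML tags and decode entities, leaving only the visible text."""
    return unescape(TAG_RE.sub("", text)).strip()


def parse_grid(raw_value, wanted=WANTED):
    """Turn the hidden input's value (a JSON array) into a list of row dicts."""
    rows = json.loads(raw_value)
    # Missing keys become empty strings, since some cells may be blank.
    return [{key: strip_tags(str(row.get(key, ""))) for key in wanted}
            for row in rows]


def read_grid(driver, css="input.jsonGridData"):
    """Read the grid from a logged-in Selenium driver (selector is an assumption)."""
    # Imported here so the parsing helpers can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    element = driver.find_element(By.CSS_SELECTOR, css)
    # The browser has already decoded any HTML entities in the attribute.
    return parse_grid(element.get_attribute("value"))


# Usage, after logging in and navigating to the page:
#   rows = read_grid(driver)
#   for row in rows:
#       print(row["ActionDate"], row["emailto"])
```

If the page contains more than one grid, `driver.find_elements` with the same selector would return all of the hidden inputs, and each value could be passed to `parse_grid` separately.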