Scraping JSON grid data with Python 3

Time: 2019-10-23 16:53:47

Tags: json python-3.x web-scraping

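I am scraping an applicant tracking system (BrassRing). I can log in with Selenium without any trouble and reach the page I am interested in. While looking for the table, I found that the data I need is stored in a jsonGrid.

Everything I can find about Selenium and scraping does not cover how to scrape a JSON grid.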
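One possible approach, sketched under assumptions: if the grid is the hidden `<input class="jsonGridData">` element, its `value` attribute can be read through Selenium and decoded with the standard `json` module. The function name `read_json_grids` and the default selector are hypothetical; `driver` is assumed to be an already logged-in Selenium WebDriver.

```python
import json


def read_json_grids(driver, css="input.jsonGridData"):
    """Return one list of row dicts per hidden input matching `css`.

    `driver` is assumed to be a Selenium WebDriver that has already
    navigated to the page holding the grid.
    """
    grids = []
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    for element in driver.find_elements("css selector", css):
        # The browser has already decoded &quot; and similar entities.
        raw = element.get_attribute("value")
        if raw:
            # strict=False tolerates raw line breaks inside JSON strings.
            grids.append(json.loads(raw, strict=False))
    return grids
```

Calling `read_json_grids(driver)` after the page has loaded would give a list of grids, each a list of dictionaries keyed by names such as `ActionType` and `ActionDate`.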

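The table has 8 columns, with the values listed beneath them (some cells are empty, which is fine).

As far as I can tell, the columns in the JSON itself are labeled as follows, although the website displays them slightly differently: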

Action Type
Action Date
Action By
Details
Name
emailfrom
emailto
folderid
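As a rough sketch of how these eight columns might be pulled out of one decoded row: the mapping below from the labels above to JSON keys is an assumption based on the keys visible in the excerpt (for example, `folderid` is guessed to correspond to `hm_Folderid`). The names `COLUMNS`, `strip_markup` and `project_row` are hypothetical helpers.

```python
import re
from html import unescape

# Assumed mapping: column label from the list above -> key seen in the JSON.
COLUMNS = {
    "Action Type": "ActionType",
    "Action Date": "ActionDate",
    "Action By": "ActionBy",
    "Details": "Details",
    "Name": "Name",
    "emailfrom": "emailfrom",
    "emailto": "emailto",
    "folderid": "hm_Folderid",
}

_TAG = re.compile(r"<[^>]+>")


def strip_markup(value):
    """Remove HTML tags (some cells wrap text in <a>) and decode entities."""
    return unescape(_TAG.sub("", value)).strip()


def project_row(row):
    """Keep only the eight columns; missing keys become empty strings."""
    return {
        label: strip_markup(str(row.get(key, "")))
        for label, key in COLUMNS.items()
    }
```

Applying `project_row` to each dictionary in the decoded grid would give rows that can be written out with `csv.DictWriter` using `COLUMNS.keys()` as the field names.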

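Here is the first part of the site's code, showing the headers and some of the column values.

Any information on how to scrape a JSON grid / JSON data from a website would be great.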

<input type="hidden" name="Grid$jsonData183" id="Grid_jsonData183" class="jsonGridData" value="[{&quot;ActionType&quot;: &quot;Communication - Email&quot;,&quot;ActionDate&quot;: &quot;18-Oct-2019 14:14:25&quot;,&quot;ActionBy&quot;: &quot;Manager, Automation ()&quot;,&quot;Details&quot;: &quot;Status: Sent as to&quot;,&quot;Name&quot;: &quot;&lt;a href=\&quot;#\&quot;/  class=\&quot;ViewCommunication\&quot;&gt;Not Selected&lt;/a&gt;&quot;,&quot;emailfrom&quot;: &quot;Manager, Automation ()&quot;,&quot;emailto&quot;: &quot;Smith, John(john.smith@notreal.com)&quot;,&quot;hm_category&quot;: &quot;5&quot;,&quot;hm_Folderid&quot;: &quot;6537489&quot;,&quot;hm_ReqId&quot;: &quot;-1&quot;,&quot;hm_content&quot;: &quot;1&quot;,&quot;hm_md_communication_type&quot;: &quot;Communication - Email&quot;,&quot;hm_md_communication_correspondenceid&quot;: &quot;1&quot;,&quot;hm_md_communication_correspondenceresumeid&quot;: &quot;46878397&quot;,&quot;hm_pushportal&quot;: &quot;0&quot;,&quot;hm_unpostportal&quot;: &quot;0&quot;,&quot;hm_postportall&quot;: &quot;0&quot;,&quot;hm_PortalExpired&quot;: &quot;0&quot;,&quot;hm_md_communication_agencycodetypeid&quot;: &quot;0&quot;,&quot;hm_md_communication_agencycodeid&quot;: &quot;0&quot;,&quot;hm_md_communication_userid&quot;: &quot;41&quot;,&quot;hm_md_RecipientType&quot;: &quot;4&quot;,&quot;hm_EmailLogId&quot;: &quot;0&quot;,&quot;hm_md_ReceiverUserID&quot;: &quot;0&quot;,&quot;hm_md_fid&quot;: &quot;6537489&quot;,&quot;hm_md_rid&quot;: &quot;6454343&quot;,&quot;hm_md_rftid&quot;: &quot;17&quot;,&quot;hm_md_rsts&quot;: &quot;0&quot;,&quot;hm_md_myfolder&quot;: &quot;0&quot;,&quot;foldername&quot;: &quot;&lt;a href=&#39;#&#39; class=&#39;ViewFolder&#39;&gt;1738995BR:Customer Service Associate II&lt;/a&gt;&quot;,&quot;hm_md_afl&quot;: &quot;0&quot;,&quot;hm_md_rfl&quot;: &quot;1&quot;,&quot;hm_md_rlg&quot;: &quot;en&quot;, &quot;rowmetadata&quot;: &quot;&lt;div&gt;&lt;div name=\&quot;category\&quot; 
value=\&quot;5\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;folderid\&quot; value=\&quot;6537489\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;reqid\&quot; value=\&quot;-1\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;content\&quot; value=\&quot;1\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_communication_type\&quot; value=\&quot;Communication+-+Email\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_communication_correspondenceid\&quot; value=\&quot;1\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_communication_correspondenceresumeid\&quot; value=\&quot;46878397\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;pushportal\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;unpostportal\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;postportall\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;portalexpired\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_communication_agencycodetypeid\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_communication_agencycodeid\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_communication_userid\&quot; value=\&quot;41\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_recipienttype\&quot; value=\&quot;4\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;emaillogid\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_receiveruserid\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_fid\&quot; value=\&quot;6537489\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_rid\&quot; value=\&quot;6454343\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_rftid\&quot; value=\&quot;17\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_rsts\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_myfolder\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_afl\&quot; value=\&quot;0\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_rfl\&quot; value=\&quot;1\&quot;&gt;&lt;/div&gt;&lt;div name=\&quot;md_rlg\&quot; 
value=\&quot;en\&quot;&gt;&lt;/div&gt;&lt;/div&gt;&quot;},{&quot;ActionType&quot;: &quot;Communication - Email&quot;,&quot;ActionDate&quot;: &quot;18-Oct-2019 13:24:13&quot;,&quot;ActionBy&quot;: &quot;Manager, Automation ()&quot;,&quot;Details&quot;: &quot;Status: Sent as to&quot;,&quot;Name&quot;: &quot;&lt;a href=\&quot;#\&quot;/  class=\&quot;ViewCommunication\&quot;&gt;Not Selected&lt;/a&gt;&quot;,&quot;emailfrom&quot;: &quot;Manager, Automation ()&quot;,&quot;emailto&quot;: &quot;Smith, John(john.smith@notreal.com)&quot;,&quot;hm_category&quot;: &quot;5&quot;,&quot;hm_Folderid&quot;: &quot;6513663&quot;,&quot;hm_ReqId&quot;: &quot;-1&quot;,&quot;hm_content&quot;: &quot;1&quot;,&quot;hm_md_communication_type&quot;: &quot;Communication - Email&quot;,&quot;hm_md_communication_correspondenceid&quot;:
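For a saved copy of the page source, a minimal sketch using only the standard library might look like the following. It assumes the complete `value` attribute is available (the excerpt above is cut off, so it would not parse as pasted). `JsonGridParser` and `extract_grids` are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonGridParser(HTMLParser):
    """Collect the decoded JSON from every <input class="jsonGridData">."""

    def __init__(self):
        super().__init__()
        self.grids = {}  # element id -> list of row dicts

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "input" and "jsonGridData" in classes:
            # HTMLParser has already turned &quot; into " in attribute values.
            raw = attributes.get("value") or "[]"
            # strict=False tolerates raw line breaks inside JSON strings.
            self.grids[attributes.get("id")] = json.loads(raw, strict=False)


def extract_grids(page_source):
    """Return {input id: rows} for all JSON grids found in the HTML text."""
    parser = JsonGridParser()
    parser.feed(page_source)
    parser.close()
    return parser.grids
```

With the full page source, `extract_grids(html_text)["Grid_jsonData183"]` would be the list of rows for the grid shown above.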

0 answers:

No answers