Converting nested JSON to CSV

Time: 2017-08-06 18:30:21

Tags: json api csv terminal transpose

When I use ProPublica's API, I can get the list of members of the 115th Congress from the terminal with the following:

curl "https://api.propublica.org/congress/v1/115/senate/members.json" -H "X-API-Key: MY_API_KEY"

I get a JSON response that looks like this:

{
   "status":"OK",
   "copyright":" Copyright (c) 2017 Pro Publica Inc. All Rights Reserved.",
   "results":[
  {
     "congress": "115",
     "chamber": "Senate",


     "num_results": 101,
     "offset": 0,
     "members": [
          {
             "id": "A000360",
             "api_uri":"https://api.propublica.org/congress/v1/members/A000360.json",
             "first_name": "Lamar",
             "middle_name": null,
             "last_name": "Alexander",
             "date_of_birth": "1940-07-03",
             "party": "R",
             "leadership_role": null,
             "twitter_account": "SenAlexander",
             "facebook_account": "senatorlamaralexander",
             "youtube_account": "lamaralexander",
             "govtrack_id": "300002",
             "cspan_id": "5",
             "votesmart_id": "15691",
             "icpsr_id": "40304",
             "crp_id": "N00009888",
             "google_entity_id": "/m/01rbs3",
             "url": "https://www.alexander.senate.gov/public",
             "rss_url": "https://www.alexander.senate.gov/public/?a=RSS.Feed",
             "contact_form": "http://www.alexander.senate.gov/public/index.cfm?p=Email",
             "domain": null,
             "in_office": true,
             "dw_nominate": 0.323,
             "ideal_point": null,
             "seniority": "15",
             "next_election": "2020",
             "total_votes": 187,
             "missed_votes": 7,
             "total_present": 0,
             "ocd_id": "ocd-division/country:us/state:tn",
             "office": "455 Dirksen Senate Office Building",
             "phone": "202-224-4944",
             "fax": "202-228-3398",
             "state": "TN",
             "senate_class": "2",
             "state_rank": "senior",
             "lis_id": "S289"
             ,"missed_votes_pct": 3.74,
             "votes_with_party_pct": 98.89
           },
          {
             "id": "B000575",
             "api_uri":"https://api.propublica.org/congress/v1/members/B000575.json",
             "first_name": "Roy",
             "middle_name": null,
             "last_name": "Blunt",
             "date_of_birth": "1950-01-10",
             "party": "R",
             "leadership_role": null,
             "twitter_account": "RoyBlunt",
             "facebook_account": "SenatorBlunt",
             "youtube_account": "SenatorBlunt",
             "govtrack_id": "400034",
             "cspan_id": "45465",
             "votesmart_id": "418",
             "icpsr_id": "29735",
             "crp_id": "N00005195",
             "google_entity_id": "/m/034fn4",
             "url": "https://www.blunt.senate.gov/public",
             "rss_url": "http://www.blunt.senate.gov/public/?a=RSS.Feed",
             "contact_form": "https://www.blunt.senate.gov/public/index.cfm/contact-roy",
             "domain": null,
             "in_office": true,
             "dw_nominate": 0.431,
             "ideal_point": null,
             "seniority": "7",
             "next_election": "2022",
             "total_votes": 187,
             "missed_votes": 2,
             "total_present": 0,
             "ocd_id": "ocd-division/country:us/state:mo",
             "office": "260 Russell Senate Office Building",
             "phone": "202-224-5721",
             "fax": "202-224-8149",
             "state": "MO",
             "senate_class": "3",
             "state_rank": "junior",
             "lis_id": "S342"
             ,"missed_votes_pct": 1.07,
             "votes_with_party_pct": 99.46
           },

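But when I convert it to CSV, the result has only two rows (one of which is the column headers) and stretches to nearly 4,000 columns. Because of the way the JSON is nested, that seems to be the only way I can get it into CSV. I want it in CSV so that I can import it into SQL properly.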

I counted the headers, and there are 39 for each member of Congress. They are named members/0/id, members/0/api_uri, and so on, up to members/100/id, members/100/api_uri, and so on.

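Is there any way to do this without editing the data by hand? Ideally, I would run my terminal script, output to CSV, and then upload the result to SQL to work with it. If it ended up as 100 rows with 39 columns, rather than one row with 3,900 columns, that would work well.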

1 answer:

Answer 0: (score: 0)

Here is a solution using jq.

If the file filter.jq contains

  .results[].members
| ( .[0] | keys_unsorted ) as $cols
| $cols,
  ( .[] | [ .[$cols[]] ] )
| @csv

and your data is in a file named data.json, then

jq -M -r -f filter.jq data.json

will produce the output below. (keys_unsorted is used rather than keys because keys sorts the names alphabetically, which would put the header row in a different order from the values.)

"id","api_uri","first_name","middle_name","last_name","date_of_birth","party","leadership_role","twitter_account","facebook_account","youtube_account","govtrack_id","cspan_id","votesmart_id","icpsr_id","crp_id","google_entity_id","url","rss_url","contact_form","domain","in_office","dw_nominate","ideal_point","seniority","next_election","total_votes","missed_votes","total_present","ocd_id","office","phone","fax","state","senate_class","state_rank","lis_id","missed_votes_pct","votes_with_party_pct"
"A000360","https://api.propublica.org/congress/v1/members/A000360.json","Lamar",,"Alexander","1940-07-03","R",,"SenAlexander","senatorlamaralexander","lamaralexander","300002","5","15691","40304","N00009888","/m/01rbs3","https://www.alexander.senate.gov/public","https://www.alexander.senate.gov/public/?a=RSS.Feed","http://www.alexander.senate.gov/public/index.cfm?p=Email",,true,0.323,,"15","2020",187,7,0,"ocd-division/country:us/state:tn","455 Dirksen Senate Office Building","202-224-4944","202-228-3398","TN","2","senior","S289",3.74,98.89
"B000575","https://api.propublica.org/congress/v1/members/B000575.json","Roy",,"Blunt","1950-01-10","R",,"RoyBlunt","SenatorBlunt","SenatorBlunt","400034","45465","418","29735","N00005195","/m/034fn4","https://www.blunt.senate.gov/public","http://www.blunt.senate.gov/public/?a=RSS.Feed","https://www.blunt.senate.gov/public/index.cfm/contact-roy",,true,0.431,,"7","2022",187,2,0,"ocd-division/country:us/state:mo","260 Russell Senate Office Building","202-224-5721","202-224-8149","MO","3","junior","S342",1.07,99.46