When I use the ProPublica API, I can get the list of members of the 115th Congress from the terminal with:
curl "https://api.propublica.org/congress/v1/115/senate/members.json" -H "X-API-Key: MY_API_KEY"
I get a JSON response that looks like this:
{
"status":"OK",
"copyright":" Copyright (c) 2017 Pro Publica Inc. All Rights Reserved.",
"results":[
{
"congress": "115",
"chamber": "Senate",
"num_results": 101,
"offset": 0,
"members": [
{
"id": "A000360",
"api_uri":"https://api.propublica.org/congress/v1/members/A000360.json",
"first_name": "Lamar",
"middle_name": null,
"last_name": "Alexander",
"date_of_birth": "1940-07-03",
"party": "R",
"leadership_role": null,
"twitter_account": "SenAlexander",
"facebook_account": "senatorlamaralexander",
"youtube_account": "lamaralexander",
"govtrack_id": "300002",
"cspan_id": "5",
"votesmart_id": "15691",
"icpsr_id": "40304",
"crp_id": "N00009888",
"google_entity_id": "/m/01rbs3",
"url": "https://www.alexander.senate.gov/public",
"rss_url": "https://www.alexander.senate.gov/public/?a=RSS.Feed",
"contact_form": "http://www.alexander.senate.gov/public/index.cfm?p=Email",
"domain": null,
"in_office": true,
"dw_nominate": 0.323,
"ideal_point": null,
"seniority": "15",
"next_election": "2020",
"total_votes": 187,
"missed_votes": 7,
"total_present": 0,
"ocd_id": "ocd-division/country:us/state:tn",
"office": "455 Dirksen Senate Office Building",
"phone": "202-224-4944",
"fax": "202-228-3398",
"state": "TN",
"senate_class": "2",
"state_rank": "senior",
"lis_id": "S289",
"missed_votes_pct": 3.74,
"votes_with_party_pct": 98.89
},
{
"id": "B000575",
"api_uri":"https://api.propublica.org/congress/v1/members/B000575.json",
"first_name": "Roy",
"middle_name": null,
"last_name": "Blunt",
"date_of_birth": "1950-01-10",
"party": "R",
"leadership_role": null,
"twitter_account": "RoyBlunt",
"facebook_account": "SenatorBlunt",
"youtube_account": "SenatorBlunt",
"govtrack_id": "400034",
"cspan_id": "45465",
"votesmart_id": "418",
"icpsr_id": "29735",
"crp_id": "N00005195",
"google_entity_id": "/m/034fn4",
"url": "https://www.blunt.senate.gov/public",
"rss_url": "http://www.blunt.senate.gov/public/?a=RSS.Feed",
"contact_form": "https://www.blunt.senate.gov/public/index.cfm/contact-roy",
"domain": null,
"in_office": true,
"dw_nominate": 0.431,
"ideal_point": null,
"seniority": "7",
"next_election": "2022",
"total_votes": 187,
"missed_votes": 2,
"total_present": 0,
"ocd_id": "ocd-division/country:us/state:mo",
"office": "260 Russell Senate Office Building",
"phone": "202-224-5721",
"fax": "202-224-8149",
"state": "MO",
"senate_class": "3",
"state_rank": "junior",
"lis_id": "S342",
"missed_votes_pct": 1.07,
"votes_with_party_pct": 99.46
},
etc.
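But when I convert this to CSV, it has only two rows (one of which is the column headers) and stretches to nearly 4,000 columns. Because of the way the JSON is nested, this seems to be the only way I can convert it to CSV. I want it in CSV so that I can import it into SQL properly.
I counted the headers, and there are 39 per member of Congress. They are named members/0/id, members/0/api_uri, and so on, up to members/100/id, members/100/api_uri, and so on.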
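Header names like members/0/id are what a generic converter tends to emit when it flattens the whole document into a single record. The following is only a minimal Python sketch of that behavior, not the converter actually used here: the flatten helper is hypothetical, and the input is a trimmed-down stand-in for one element of "results" with just two fields per member.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts/lists into one {path: scalar} mapping."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Scalars (str, number, bool, None) end the recursion.
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}/{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


# Trimmed-down stand-in for one element of "results" (assumption: two fields only).
result = {
    "congress": "115",
    "members": [
        {"id": "A000360", "last_name": "Alexander"},
        {"id": "B000575", "last_name": "Blunt"},
    ],
}

# Flattening the whole object gives ONE wide record: every member
# contributes its own set of columns, prefixed with its array index.
wide = flatten(result)
print(list(wide))

# Iterating over the array first gives one record PER member instead.
rows = [flatten(member) for member in result["members"]]
print(rows)
```

With 39 fields per member, the first form grows by 39 columns for every member, while the second keeps 39 columns and adds a row per member.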
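Is there any way I can do this without making manual changes? In an ideal world, I would be able to run my terminal script, output to CSV, and then upload that to SQL to work with it. It would work great if it ended up as 100 rows with 39 columns instead of one row with 3,900 columns.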
Answer 0 (score: 0)
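Here is a solution using jq. It takes the column names from the first member with keys_unsorted, which keeps them in their original order (keys would sort them alphabetically and leave the header out of step with the values), and then emits each member's values in that same order.
If the file filter.jq contains
.results[].members
| ( .[0] | keys_unsorted ) as $keys
| $keys,
  ( .[] | [ .[$keys[]] ] )
| @csv
and your data is in a file named data.json, then
jq -M -r -f filter.jq data.json
will produce
"id","api_uri","first_name","middle_name","last_name","date_of_birth","party","leadership_role","twitter_account","facebook_account","youtube_account","govtrack_id","cspan_id","votesmart_id","icpsr_id","crp_id","google_entity_id","url","rss_url","contact_form","domain","in_office","dw_nominate","ideal_point","seniority","next_election","total_votes","missed_votes","total_present","ocd_id","office","phone","fax","state","senate_class","state_rank","lis_id","missed_votes_pct","votes_with_party_pct"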
"A000360","https://api.propublica.org/congress/v1/members/A000360.json","Lamar",,"Alexander","1940-07-03","R",,"SenAlexander","senatorlamaralexander","lamaralexander","300002","5","15691","40304","N00009888","/m/01rbs3","https://www.alexander.senate.gov/public","https://www.alexander.senate.gov/public/?a=RSS.Feed","http://www.alexander.senate.gov/public/index.cfm?p=Email",,true,0.323,,"15","2020",187,7,0,"ocd-division/country:us/state:tn","455 Dirksen Senate Office Building","202-224-4944","202-228-3398","TN","2","senior","S289",3.74,98.89
"B000575","https://api.propublica.org/congress/v1/members/B000575.json","Roy",,"Blunt","1950-01-10","R",,"RoyBlunt","SenatorBlunt","SenatorBlunt","400034","45465","418","29735","N00005195","/m/034fn4","https://www.blunt.senate.gov/public","http://www.blunt.senate.gov/public/?a=RSS.Feed","https://www.blunt.senate.gov/public/index.cfm/contact-roy",,true,0.431,,"7","2022",187,2,0,"ocd-division/country:us/state:mo","260 Russell Senate Office Building","202-224-5721","202-224-8149","MO","3","junior","S342",1.07,99.46
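If jq is not available, the same reshaping can be sketched with Python's standard library. This is an alternative under stated assumptions rather than part of the jq solution: the function name members_to_csv is hypothetical, and the inline sample keeps only a few of the 39 fields.

```python
import csv
import io
import json


def members_to_csv(payload, out):
    """Write one CSV row per member to `out`; return the number of data rows."""
    # Collect the members from every element of "results" (the sample has one).
    members = [m for result in payload["results"] for m in result["members"]]
    if not members:
        return 0
    # Union of all keys, in first-seen order, so header and values stay aligned
    # even if some member lacks a field.
    fieldnames = list(dict.fromkeys(key for m in members for key in m))
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(members)  # None becomes an empty cell
    return len(members)


# Abbreviated sample in the same shape as the API response (assumption: 5 fields).
sample = json.loads("""
{"results": [{"members": [
  {"id": "A000360", "first_name": "Lamar", "middle_name": null,
   "in_office": true, "dw_nominate": 0.323},
  {"id": "B000575", "first_name": "Roy", "middle_name": null,
   "in_office": true, "dw_nominate": 0.431}
]}]}
""")

buffer = io.StringIO()
count = members_to_csv(sample, buffer)
print(count)
print(buffer.getvalue())
```

To run it on the real response, load data.json with json.load and pass a file opened with open("members.csv", "w", newline="") in place of the StringIO buffer. Note that Python writes booleans as True/False, whereas jq's @csv writes true/false, which may matter when the file is imported into SQL.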