How can I parse an irregular JSON response from a US Census API call?

Time: 2018-06-19 17:24:58

Tags: json parsing jq census hjson

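Below is the response from a US Census API call. The response is in an irregular JSON format (really just JavaScript object notation) and has the form {one set of data} {second set of data}. How can I parse the second set, i.e. the array that follows the data: key in the second {} block? Many thanks.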

{ 'Incorporated Places': 
   [ { OID: 27890266655573,
       STATE: '24',
       FUNCSTAT: 'A',
       NAME: 'Annapolis city',
       AREAWATER: 2344089,
       LSADC: '25',
       CENTLON: '-076.5039149',
       PLACECC: 'C1',
       BASENAME: 'Annapolis',
       INTPTLAT: '+38.9721517',
       MTFCC: 'G4110',
       PLACE: '01600',
       CBSAPCI: 'N',
       GEOID: '2401600',
       PLACENS: '02390562',
       CENTLAT: '+38.9706625',
       INTPTLON: '-076.5053318',
       NECTAPCI: 'N',
       AREALAND: 18668789,
       OBJECTID: 17981 } ],
  Counties: 
   [ { OID: 27590266629759,
       STATE: '24',
       FUNCSTAT: 'A',
       AREAWATER: 448136044,
       NAME: 'Anne Arundel County',
       LSADC: '06',
       CENTLON: '-076.5675544',
       BASENAME: 'Anne Arundel',
       INTPTLAT: '+38.9916103',
       COUNTYCC: 'H1',
       MTFCC: 'G4020',
       COUNTY: '003',
       GEOID: '24003',
       CENTLAT: '+38.9939578',
       INTPTLON: '-076.5607311',
       AREALAND: 1074250699,
       COUNTYNS: '01710958',
       OBJECTID: 1026 } ],
  'Census Tracts': 
   [ { OID: 20790266676100,
       STATE: '24',
       FUNCSTAT: 'S',
       NAME: 'Census Tract 7061.01',
       AREAWATER: 249211,
       LSADC: 'CT',
       CENTLON: '-076.4923901',
       BASENAME: '7061.01',
       INTPTLAT: '+38.9772243',
       MTFCC: 'G5020',
       COUNTY: '003',
       GEOID: '24003706101',
       CENTLAT: '+38.9778105',
       INTPTLON: '-076.4922890',
       AREALAND: 1216538,
       OBJECTID: 9052,
       TRACT: '706101' } ],
  '2010 Census Blocks': 
   [ { BLKGRP: '2',
       OID: 210404225273924,
       FUNCSTAT: 'S',
       STATE: '24',
       AREAWATER: 0,
       NAME: 'Block 2014',
       SUFFIX: '',
       LSADC: 'BK',
       CENTLON: '-076.4906778',
       LWBLKTYP: 'L',
       BASENAME: '2014',
       BLOCK: '2014',
       INTPTLAT: '+38.9788709',
       MTFCC: 'G5040',
       COUNTY: '003',
       GEOID: '240037061012014',
       CENTLAT: '+38.9788709',
       INTPTLON: '-076.4906778',
       AREALAND: 15203,
       OBJECTID: 3877566,
       TRACT: '706101' } ],
  States: 
   [ { OID: 27490140608205,
       STATE: '24',
       FUNCSTAT: 'A',
       NAME: 'Maryland',
       AREAWATER: 6980364276,
       LSADC: '00',
       CENTLON: '-076.6789663',
       STUSAB: 'MD',
       BASENAME: 'Maryland',
       INTPTLAT: '+38.9466584',
       DIVISION: '5',
       MTFCC: 'G4000',
       STATENS: '01714934',
       GEOID: '24',
       CENTLAT: '+38.9463607',
       INTPTLON: '-076.6744939',
       REGION: '3',
       AREALAND: 25150702890,
       OBJECTID: 5 } ] }
{ level: 'state',
  sublevel: true,
  variables: [ 'income', 'population', 'C24010_011E', 'education_masters' ],
  year: 2015,
  api: 'acs5',
  lat: 38.9786,
  lng: -76.4911,
  state: '24',
  county: '003',
  tract: '706101',
  blockGroup: '2',
  place: '01600',
  place_name: 'Annapolis city',
  data: 
   [ { name: 'Allegany County, Maryland',
       state: '24',
       county: '001',
       income: '40551',
       population: '73549',
       C24010_011E: '1039',
       education_masters: '3255' },
     { name: 'Anne Arundel County, Maryland',
       state: '24',
       county: '003',
       income: '89860',
       population: '555280',
       C24010_011E: '9306',
       education_masters: '45047' },
     { name: 'Baltimore County, Maryland',
       state: '24',
       county: '005',
       income: '67095',
       population: '822959',
       C24010_011E: '16725',
       education_masters: '60803' },
     { name: 'Calvert County, Maryland',
       state: '24',
       county: '009',
       income: '95828',
       population: '90114',
       C24010_011E: '986',
       education_masters: '5628' },
     { name: 'Caroline County, Maryland',
       state: '24',
       county: '011',
       income: '52465',
       population: '32661',
       C24010_011E: '328',
       education_masters: '1011' },
     { name: 'Carroll County, Maryland',
       state: '24',
       county: '013',
       income: '85385',
       population: '167444',
       C24010_011E: '2977',
       education_masters: '11030' },
     { name: 'Cecil County, Maryland',
       state: '24',
       county: '015',
       income: '66396',
       population: '101960',
       C24010_011E: '1421',
       education_masters: '4239' },
     { name: 'Charles County, Maryland',
       state: '24',
       county: '017',
       income: '90607',
       population: '152754',
       C24010_011E: '2367',
       education_masters: '8300' },
     { name: 'Dorchester County, Maryland',
       state: '24',
       county: '019',
       income: '47093',
       population: '32534',
       C24010_011E: '461',
       education_masters: '1349' },
     { name: 'Frederick County, Maryland',
       state: '24',
       county: '021',
       income: '83700',
       population: '241373',
       C24010_011E: '5026',
       education_masters: '18962' },
     { name: 'Garrett County, Maryland',
       state: '24',
       county: '023',
       income: '45432',
       population: '29813',
       C24010_011E: '344',
       education_masters: '1434' },
     { name: 'Harford County, Maryland',
       state: '24',
       county: '025',
       income: '80465',
       population: '248966',
       C24010_011E: '4243',
       education_masters: '18132' },
     { name: 'Howard County, Maryland',
       state: '24',
       county: '027',
       income: '110238',
       population: '304115',
       C24010_011E: '7687',
       education_masters: '42253' },
     { name: 'Kent County, Maryland',
       state: '24',
       county: '029',
       income: '58147',
       population: '19923',
       C24010_011E: '261',
       education_masters: '1189' },
     { name: 'Montgomery County, Maryland',
       state: '24',
       county: '031',
       income: '99435',
       population: '1017859',
       C24010_011E: '33271',
       education_masters: '132184' },
     { name: 'Prince George\'s County, Maryland',
       state: '24',
       county: '033',
       income: '74260',
       population: '892816',
       C24010_011E: '16164',
       education_masters: '58017' },
     { name: 'Queen Anne\'s County, Maryland',
       state: '24',
       county: '035',
       income: '85963',
       population: '48600',
       C24010_011E: '876',
       education_masters: '3419' },
     { name: 'St. Mary\'s County, Maryland',
       state: '24',
       county: '037',
       income: '86987',
       population: '109614',
       C24010_011E: '1625',
       education_masters: '7640' },
     { name: 'Somerset County, Maryland',
       state: '24',
       county: '039',
       income: '35154',
       population: '25980',
       C24010_011E: '208',
       education_masters: '715' },
     { name: 'Talbot County, Maryland',
       state: '24',
       county: '041',
       income: '58228',
       population: '37799',
       C24010_011E: '711',
       education_masters: '2855' },
     { name: 'Washington County, Maryland',
       state: '24',
       county: '043',
       income: '56228',
       population: '149270',
       C24010_011E: '1861',
       education_masters: '6313' },
     { name: 'Wicomico County, Maryland',
       state: '24',
       county: '045',
       income: '52278',
       population: '101182',
       C24010_011E: '1788',
       education_masters: '5466' },
     { name: 'Worcester County, Maryland',
       state: '24',
       county: '047',
       income: '56773',
       population: '51519',
       C24010_011E: '781',
       education_masters: '3110' },
     { name: 'Baltimore city, Maryland',
       state: '24',
       county: '510',
       income: '42241',
       population: '622454',
       C24010_011E: '15642',
       education_masters: '36590' } ] }

1 Answer:

Answer 0: (score: 0)

The good news is that tools such as hjson (http://hjson.org/users.html), any-json (https://www.npmjs.com/package/any-json), and json5 (npm install json5) can handle either of the two top-level quasi-JSON objects, for example:

hjson -j census.2.txt

or:

any-json --input-format=cson census.2.txt

or:

json5 -c census.2.txt

The bad news is that each of these three tools can only process one top-level object at a time.

Option 1: split

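So one possibility is to split the response into two parts. This is easy if you can assume that only the top-level objects have an opening brace ({) in column 1. Under that assumption there are dozens of ways to do it, e.g. using split -p '^{' (the -p option exists in BSD/macOS split but not in GNU coreutils split).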
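As a rough sketch of the split approach, still assuming that only top-level objects start with { in column 1, awk can do the same job where split -p is not available. The file names (census.txt, part1.txt, part2.txt), the temporary directory, and the trimmed-down sample input are made up for illustration:

```shell
#!/bin/sh
# Hypothetical, shortened sample: two top-level objects,
# each opening with "{" in column 1.
workdir=$(mktemp -d)
cat > "$workdir/census.txt" <<'EOF'
{ States:
   [ { NAME: 'Maryland' } ] }
{ level: 'state',
  data:
   [ { name: 'Allegany County, Maryland' } ] }
EOF

# Start a new output file whenever a line begins with "{";
# every other line is appended to the current output file.
awk -v prefix="$workdir/part" '
  substr($0, 1, 1) == "{" { if (out != "") close(out); n++; out = prefix n ".txt" }
  out != "" { print > out }
' "$workdir/census.txt"

ls "$workdir"
```

After this, part2.txt holds only the second object, so it can be handed to one of the converters above, e.g. hjson -j part2.txt.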

Option 2: hjson

Another possibility is to wrap the response in square brackets and then use hjson to convert the resulting pseudo-array into valid JSON.

So if you only want the second object (converted to valid JSON), you could run:

(echo "["; cat census.txt; echo "]") | hjson -j | jq '.[1]'

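This uses the wonderful jq utility, but there are other tools that would also do the job. To get only the array under data, as the question asks, the jq filter can be extended to '.[1].data'.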