How do I normalize a nested .json?

Time: 2021-07-23 22:29:58

Tags: python json pandas normalize

I'm using the Mapbox Web API, which returns a .json response, and I'm having trouble parsing it. One of the challenges is that the returned .json is nested. Here is the .json:

{
   "type":"FeatureCollection",
   "query":[
      -73.989,
      40.733
   ],
   "features":[
      {
         "id":"locality.12696928000137850",
         "type":"Feature",
         "place_type":[
            "locality"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q11299"
         },
         "text":"Manhattan",
         "place_name":"Manhattan, New York, United States",
         "bbox":[
            -74.047313153061,
            40.679573,
            -73.907,
            40.8820749648427
         ],
         "center":[
            -73.9597,
            40.7903
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -73.9597,
               40.7903
            ]
         },
         "context":[
            {
               "id":"place.2618194975964500",
               "wikidata":"Q60",
               "text":"New York"
            },
            {
               "id":"district.12113562209855570",
               "wikidata":"Q500416",
               "text":"New York County"
            },
            {
               "id":"region.17349986251855570",
               "wikidata":"Q1384",
               "short_code":"US-NY",
               "text":"New York"
            },
            {
               "id":"country.19678805456372290",
               "wikidata":"Q30",
               "short_code":"us",
               "text":"United States"
            }
         ]
      },
      {
         "id":"region.17349986251855570",
         "type":"Feature",
         "place_type":[
            "region"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q1384",
            "short_code":"US-NY"
         },
         "text":"New York",
         "place_name":"New York, United States",
         "bbox":[
            -79.8578350999901,
            40.4771391062446,
            -71.7564918092633,
            45.0239286969073
         ],
         "center":[
            -75.4652471468304,
            42.751210955
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -75.4652471468304,
               42.751210955
            ]
         },
         "context":[
            {
               "id":"country.19678805456372290",
               "wikidata":"Q30",
               "short_code":"us",
               "text":"United States"
            }
         ]
      },
      {
         "id":"country.19678805456372290",
         "type":"Feature",
         "place_type":[
            "country"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q30",
            "short_code":"us"
         },
         "text":"United States",
         "place_name":"United States",
         "bbox":[
            -179.9,
            18.8163608007951,
            -66.8847646185949,
            71.4202919997506
         ],
         "center":[
            -97.9222112121185,
            39.3812661305678
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -97.9222112121185,
               39.3812661305678
            ]
         }
      }
   ],
   "attribution":"NOTICE: © 2021 Mapbox and its suppliers. All rights reserved. Use of this data is subject to the Mapbox Terms of Service (https://www.mapbox.com/about/maps/). This response and the information it contains may not be retained. POI(s) provided by Foursquare."
}

I was able to load it into a DataFrame with the following snippet:

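import json
import requests
from pandas import json_normalize

url = ("https://api.mapbox.com/geocoding/v5/mapbox.places/-73.989,40.733.json"
       "?types=country,region,locality&access_token=MY_KEY_HERE")

data = json.loads(requests.get(url).text)

# One row per entry in data['features']
df = json_normalize(data, 'features')

return df  # this snippet sits inside a function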
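
For reference, here is a minimal sketch of what that call produces, run offline on a trimmed, hand-copied subset of the response above instead of a live request. The variable names `sample` and `flat` are made up for illustration, and only a few keys are kept. Nested dicts such as `properties` and `geometry` should come out as dotted column names, while list values stay as lists in a single cell.

```python
from pandas import json_normalize

# Trimmed subset of the Mapbox response shown above (assumption: only a
# few keys are kept so the example stays short and needs no API key).
sample = {
    "type": "FeatureCollection",
    "query": [-73.989, 40.733],
    "features": [
        {
            "id": "locality.12696928000137850",
            "text": "Manhattan",
            "properties": {"wikidata": "Q11299"},
            "geometry": {"type": "Point", "coordinates": [-73.9597, 40.7903]},
        },
        {
            "id": "region.17349986251855570",
            "text": "New York",
            "properties": {"wikidata": "Q1384", "short_code": "US-NY"},
            "geometry": {
                "type": "Point",
                "coordinates": [-75.4652471468304, 42.751210955],
            },
        },
    ],
}

# record_path='features' yields one row per feature; nested dicts are
# flattened into columns such as 'properties.wikidata'.
flat = json_normalize(sample, "features")
print(flat.columns.tolist())
```

Keys that are missing from a feature (here `short_code` on the first one) end up as NaN in that row.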

However, I found that I also need to add [query] to it, so I modified the relevant portion as follows:

url = ("https://api.mapbox.com/geocoding/v5/mapbox.places/-73.989,40.733.json"
       "?types=country,region,locality&access_token=MY_KEY_HERE")

data = json.loads(requests.get(url).text)

# Passing 'query' as meta is the call that raises the ValueError below
df = json_normalize(data, 'features', ['query'])

return df

(I followed the syntax from the documentation.)

The error I get says:

ValueError: Length of values does not match length of index

The query field looks like this:

[Image: Query field needs to be added to the dataframe]

I'm not sure what the error means or how to fix it.

This is the output DataFrame I want: [Image: Desired Output]

I can clean up and drop the fields I don't need, but I can't get the [query] field to show up.

1 Answer:

Answer 0: (score: 1)

Add the query column after calling json_normalize:

df.insert(0, 'query', [data['query']] * len(df))
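
As a rough end-to-end sketch of the line above, the following runs it against a trimmed, hand-written copy of the response rather than a live Mapbox request. The extra `query_lon` and `query_lat` columns are a hypothetical addition, not part of the original answer, for cases where scalar columns are easier to work with than a list in each cell.

```python
from pandas import json_normalize

# Trimmed stand-in for the Mapbox response (assumption: only 'id' and
# 'text' are kept per feature to keep the example short).
data = {
    "type": "FeatureCollection",
    "query": [-73.989, 40.733],
    "features": [
        {"id": "locality.12696928000137850", "text": "Manhattan"},
        {"id": "region.17349986251855570", "text": "New York"},
        {"id": "country.19678805456372290", "text": "United States"},
    ],
}

# Normalize only the features; do not pass 'query' as meta.
df = json_normalize(data, "features")

# Repeat the [lon, lat] list once per row and insert it as the first
# column. Every cell refers to the same list object.
df.insert(0, "query", [data["query"]] * len(df))

# Hypothetical extra step: scalar columns broadcast to every row.
df["query_lon"] = data["query"][0]
df["query_lat"] = data["query"][1]

print(df)
```

Building the repeated list outside json_normalize sidesteps the meta handling entirely, so the column length always matches the number of rows.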