How do I convert nested JSON to a DataFrame?

Time: 2021-02-14 06:26:05

Tags: python json pandas

{
    "RecordSet": {
        "geometryType": "esriGeometryPoint",
        "spatialReference": {
            "wkid": "DLT0"
        },
        "features": [{
                "geometry": {
                    "x": 4941.9900000002,
                    "y": 27766.2190000005,
                    "spatialReference": {
                        "wkid": "DLT0"
                    }
                },
                "attributes": {
                    "COMMUNITY_": 35,
                    "NAME_ENG": "FOUR POINTS",
                    "LATITUDE": 25.1404585,
                    "LONGITUDE": 55.2599731,
                    "MNUM": "261 89753"
                }
            },
            {
                "geometry": {
                    "x": 5080.0719999997,
                    "y": 23025.9379999992,
                    "spatialReference": {
                        "wkid": "DLT0"
                    }
                },
                "attributes": {
                    "COMMUNITY_": 12,
                    "NAME_ENG": "PARK  - PORT ",
                    "LATITUDE": 25.24831,
                    "LONGITUDE": 55.312064,
                    "MNUM": "321445"
                }
            }
        ]
    }
}

Code:

Try1. pd.DataFrame(sum(json.load(open('test.json')), []))

Try2. df = pd.concat([pd.DataFrame(x) for x in data], ignore_index=False)

Try3. df = pd.json_normalize(data)
      df.columns = df.columns.map(lambda x: x.split(".")[-1])
      df

Try4. test = pd.json_normalize(data)


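I am trying to get the columns x, y, spatialRef, COMMUNITY_, NAME_ENG, LATITUDE, LONGITUDE, MNUM. I have looked at other questions on this topic and tried various ways of loading the JSON file into pandas, but none of them produced this layout. What is the best way to do it?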

1 answer:

Answer 0 (score: 2)

  • Simplest: call json_normalize() once, pointing it at the features list
  • More involved: expand everything, including the top-level RecordSet fields

import pandas as pd

js = {'RecordSet': {'geometryType': 'esriGeometryPoint',
  'spatialReference': {'wkid': 'DLT0'},
  'features': [{'geometry': {'x': 4941.9900000002,
     'y': 27766.2190000005,
     'spatialReference': {'wkid': 'DLT0'}},
    'attributes': {'COMMUNITY_': 35,
     'NAME_ENG': 'FOUR POINTS',
     'LATITUDE': 25.1404585,
     'LONGITUDE': 55.2599731,
     'MNUM': '261 89753'}},
   {'geometry': {'x': 5080.0719999997,
     'y': 23025.9379999992,
     'spatialReference': {'wkid': 'DLT0'}},
    'attributes': {'COMMUNITY_': 12,
     'NAME_ENG': 'PARK  - PORT ',
     'LATITUDE': 25.24831,
     'LONGITUDE': 55.312064,
     'MNUM': '321445'}}]}}

# simplest: normalize the list of features directly
pd.json_normalize(js["RecordSet"]["features"])

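# complete: also carry the top-level RecordSet fields onto every feature row
df = pd.json_normalize(js, record_path=[["RecordSet","features"]], meta="RecordSet")
# expand the RecordSet dict into columns, then drop the dict and the duplicated features list
df = df.join(df["RecordSet"].apply(pd.Series)).drop(columns=["RecordSet","features"])
# expand the remaining spatialReference dict into a wkid column
df = df.join(df["spatialReference"].apply(pd.Series)).drop(columns=["spatialReference"])
df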
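
Both variants above leave dotted column names such as geometry.x and attributes.NAME_ENG, while the question asks for plain x, y, spatialRef, and so on. A minimal sketch of one way to get there, assuming the data has the same shape as in the question: rename the nested wkid column first, then strip the prefixes from the rest. The name spatialRef is taken from the question's wish list; the sample data below is copied from the question.

```python
import pandas as pd

# Sample data copied from the question (same shape as test.json is assumed to have).
js = {"RecordSet": {
    "geometryType": "esriGeometryPoint",
    "spatialReference": {"wkid": "DLT0"},
    "features": [
        {"geometry": {"x": 4941.9900000002, "y": 27766.2190000005,
                      "spatialReference": {"wkid": "DLT0"}},
         "attributes": {"COMMUNITY_": 35, "NAME_ENG": "FOUR POINTS",
                        "LATITUDE": 25.1404585, "LONGITUDE": 55.2599731,
                        "MNUM": "261 89753"}},
        {"geometry": {"x": 5080.0719999997, "y": 23025.9379999992,
                      "spatialReference": {"wkid": "DLT0"}},
         "attributes": {"COMMUNITY_": 12, "NAME_ENG": "PARK  - PORT ",
                        "LATITUDE": 25.24831, "LONGITUDE": 55.312064,
                        "MNUM": "321445"}},
    ],
}}

# One row per feature; nested keys become dotted column names.
df = pd.json_normalize(js["RecordSet"]["features"])

# Give the nested wkid column the name the question asks for,
# before the prefixes are stripped (otherwise it would end up as "wkid").
df = df.rename(columns={"geometry.spatialReference.wkid": "spatialRef"})

# Keep only the last path segment of every remaining dotted name.
df.columns = [name.split(".")[-1] for name in df.columns]

print(df)
```

This relies on the last path segments being unique across geometry and attributes, which holds for this sample; if two nested keys shared a name, stripping the prefixes would produce duplicate columns.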