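Parsing multiple JSON columns into tables

Time: 2019-06-12 11:19:58

Tags: sql json postgresql

I have a table with 30 columns, some of which hold JSON arrays. Right now I parse them by hand, but I am wondering whether there is a dynamic way to do it.

The table looks like this: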

| flight.atfcmMeasureLocations.item | flight.ctfmAirspaceProfile | flight.ctfmPointProfile |
|-----------------------------------|----------------------------|-------------------------|
| null                              | [{...}]                    | []                      |
| null                              | []                         | [{...},{...}]           |
| [{...},{...}]                     | [{...}]                    | [{...}, {...}]          |

This is how I am parsing it at the moment, but it feels far too tedious; there must be a way to automate the process.

SELECT field::jsonb -> 'FlightAtfcmMcdmOnlyLocation'                                                                as flightAtfcmMcdmOnlyLocation,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'hotspotId'                                              as flightAtfcmRegulationLocation_hotspotId,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'mcdmState'                                              as flightAtfcmRegulationLocation_mcdmState,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'measureSubType'                                         as flightAtfcmRegulationLocation_measureSubType,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'referenceLocation-ReferenceLocationAerodrome'            as flightAtfcmRegulationLocation_referenceLocationReferenceLocationAerodrome,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'referenceLocation-ReferenceLocationAerodromeSet'         as flightAtfcmRegulationLocation_referenceLocationReferenceLocationAerodromeSet,
       field::jsonb -> 'FlightAtfcmRegulationLocation' -> 'referenceLocation-ReferenceLocationAirspace' ->> 'id'   as flightAtfcmRegulationLocation_referenceLocationReferenceLocationAirspace_id,
       field::jsonb -> 'FlightAtfcmRegulationLocation' -> 'referenceLocation-ReferenceLocationAirspace' ->> 'type' as flightAtfcmRegulationLocation_referenceLocationReferenceLocationAirspace_type,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'referenceLocation-ReferenceLocationDBEPoint'            as FlightAtfcmRegulationLocation_referenceLocationReferenceLocationDBEPoint,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'referenceLocation-ReferenceLocationPublishedPoint'      as FlightAtfcmRegulationLocation_referenceLocationReferenceLocationPublishedPoint,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'regulationId'                                           as FlightAtfcmRegulationLocation_regulationId,
       field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'toConfirm'                                              as FlightAtfcmRegulationLocation_toConfirm,
       field::jsonb -> 'FlightAtfcmReroutingLocation'                                                               as FlightAtfcmReroutingLocation

FROM (Select json_array_elements(case
                                     when ("flight.atfcmMeasureLocations.item"::text = '[]' OR
                                           "flight.atfcmMeasureLocations.item"::text = 'null') then '[null]'::json
                                     else "flight.atfcmMeasureLocations.item" end) field
      from eurocontrol_data) as json;

SELECT field::jsonb -> 'referenceLocation-ReferenceLocationAerodrome'           as referenceLocationReferenceLocationAerodrome,
       field::jsonb -> 'referenceLocation-ReferenceLocationAerodromeSet'        as referenceLocationReferenceLocationAerodromeSet,
       field::jsonb -> 'referenceLocation-ReferenceLocationAirspace' ->> 'id'   as referenceLocationReferenceLocationAirspace_id,
       field::jsonb -> 'referenceLocation-ReferenceLocationAirspace' ->> 'type' as referenceLocationReferenceLocationAirspace_type,

       field::jsonb -> 'referenceLocation-ReferenceLocationDBEPoint'            as referenceLocationReferenceLocationDBEPoint,
       field::jsonb -> 'referenceLocation-ReferenceLocationPublishedPoint'      as referenceLocationReferenceLocationPublishedPoint,
       field::jsonb -> 'regulationId'                                           as regulationId,
       field::jsonb -> 'toConfirm'                                              as toConfirm

FROM (Select json_array_elements(case
                                     when "flight.regulationLocations"::text = '[]' then '[null]'::json
                                     else "flight.regulationLocations" end) field
      from eurocontrol_data) as json;
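
The two queries above repeat one pattern: walk down the object with `->` and read the last key with `->>`. As a minimal sketch of how that repetition could be generated rather than typed, the Python below builds the SELECT list from one sample element of the array. The helper names (`sql_literal`, `json_paths`, `select_list`) and the sample object are hypothetical, and the sketch reads every leaf with `->>` (text), whereas the hand-written queries keep some leaves as `jsonb` with `->`.

```python
import re
from typing import Any, List, Tuple


def sql_literal(key: str) -> str:
    """Quote a JSON key as a SQL string literal (single quotes doubled)."""
    return "'" + key.replace("'", "''") + "'"


def json_paths(obj: Any, prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return the key path to every leaf. Non-empty objects are descended
    into; arrays, scalars, nulls and empty objects count as leaves."""
    if isinstance(obj, dict) and obj:
        paths: List[Tuple[str, ...]] = []
        for key, value in obj.items():
            paths.extend(json_paths(value, prefix + (key,)))
        return paths
    return [prefix]


def select_list(sample: dict, column: str = "field") -> str:
    """Build one 'column::jsonb -> ... ->> ... AS alias' line per leaf."""
    lines = []
    for path in json_paths(sample):
        if not path:  # the sample itself was an empty object
            continue
        steps = [f"-> {sql_literal(k)}" for k in path[:-1]]
        steps.append(f"->> {sql_literal(path[-1])}")
        chain = " ".join(steps)
        # Strip characters such as '-' so the alias is a plain identifier.
        alias = "_".join(re.sub(r"\W", "", k) for k in path)
        lines.append(f'{column}::jsonb {chain} AS "{alias}"')
    return ",\n       ".join(lines)


# Hypothetical sample element, shaped like the objects queried above.
sample = {
    "FlightAtfcmRegulationLocation": {
        "hotspotId": None,
        "referenceLocation-ReferenceLocationAirspace": {"id": "X", "type": "AS"},
    },
    "FlightAtfcmReroutingLocation": None,
}
print("SELECT " + select_list(sample))
```

The sample could be fetched once per column with a query that returns a single array element. Note that PostgreSQL by default truncates identifiers longer than 63 bytes, so very long generated aliases may need shortening.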

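My main goal is to parse the JSON into tables and to link each old cell to the rows of its new table, so that the main table and the new tables are related (possibly through an index). The result would look something like this: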

| flight.atfcmMeasureLocations.item | flight.ctfmAirspaceProfile | flight.ctfmPointProfile |
|-----------------------------------|----------------------------|-------------------------|
| null                              | 1                          | []                      |
| null                              | []                         | 1                       |
| 1                                 | 2                          | 2                       |
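
One way to picture that relationship is sketched below in plain Python. It is not the layout drawn in the table: instead of writing a key into the old cell, each child row carries a `parent_id` that points back to the main table, which is the usual shape for a one-to-many link. The function name `split_json_column`, the `id` primary-key column, and the sample rows are all assumptions made for illustration.

```python
import json
from typing import Any, Dict, List


def split_json_column(
    rows: List[Dict[str, Any]], column: str, id_column: str = "id"
) -> List[Dict[str, Any]]:
    """Explode one JSON-array column into child rows that reference
    the parent row through parent_id."""
    children: List[Dict[str, Any]] = []
    for row in rows:
        raw = row.get(column)
        # Accept both JSON text and already-decoded values.
        items = json.loads(raw) if isinstance(raw, str) else raw
        # null and [] produce no child rows at all.
        for position, item in enumerate(items or []):
            child = dict(item) if isinstance(item, dict) else {"value": item}
            child["parent_id"] = row[id_column]
            child["position"] = position  # index of the element in the array
            children.append(child)
    return children


# Hypothetical main-table rows; "id" stands in for the real primary key.
main_rows = [
    {"id": 1, "flight.ctfmPointProfile": "[]"},
    {"id": 2, "flight.ctfmPointProfile": "null"},
    {"id": 3, "flight.ctfmPointProfile": '[{"point": "A"}, {"point": "B"}]'},
]
for child in split_json_column(main_rows, "flight.ctfmPointProfile"):
    print(child)
```

With this shape the original column can be dropped after the split, and an index on `parent_id` in the child table supports the join back to the main table.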

To be honest, I am a beginner, so I am not sure what the right way to do this is.

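1 answer:

Answer 0 (score: 0)

I recently ran into a similar problem, so I decided to write a Python library that automates the job. Here is the link to the library: https://github.com/zolekode/json-to-tables

For completeness, let me copy the example from the README there:

Suppose this is your JSON file: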

[
    {
        "name": "truck",
        "brand": "BMW",
        "num_wheels": 4,
        "engine": {
            "brand": "RR",
            "date_of_production": {
                "day": 3,
                "month": "Feb",
                "year": 1990
            },
            "creators": ["Sandy", "Leslie", "Kane"]
        }
    },
    {
        "name": "bike",
        "num_wheels": 2,
        "top_speed": "100Km/hr",
        "engine": {
            "brand": "Audi",
            "date_of_production": {
                "day": 2,
                "month": "Sep",
                "year": 2002
            },
            "creators": ["Anabel", {"GreenMotors": {"CEO": "Charles Green"}}]
        }
    }
]
  1. First, load the JSON string: automobiles = json.loads(automobiles)

  2. Then run the following code:

extent_table = ExtentTable()
table_maker = TableMaker(extent_table) 
root_table_name = "automobiles" # could use any other name
table_maker.convert_json_objects_to_tables(automobiles, root_table_name)
table_maker.show_tables(num_elements=5)

  3. Export or visualize the results:
SHOWING TABLES :D


automobiles
   ID   name brand num_wheels engine top_speed
0   0  truck   BMW          4      0      None
1   1   bike  None          2      1  100Km/hr
2   2   None  None       None   None      None
____________________________________________________

engine
   ID brand date_of_production
0   0    RR                  0
1   1  Audi                  1
2   2  None               None
____________________________________________________

date_of_production
   ID   day month  year
0   0     3   Feb  1990
1   1     2   Sep  2002
2   2  None  None  None
____________________________________________________

engine_?_creators
  ID PARENT_ID is_scalar  scalar
0  0         0      True   Sandy
1  1         0      True  Leslie
2  2         0      True    Kane
3  3         1      True  Anabel
4  4         1     False    None
____________________________________________________

GreenMotors
   ID            CEO
0   0  Charles Green
1   1           None
____________________________________________________


engine_?_creators_$_GreenMotors
   ID GreenMotors PARENT_ID
0   0           0         4
1   1        None      None
____________________________________________________

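Tip: try to keep attribute names as unique as possible. This solution works for my project. If you find a bug or have an idea for an improvement, feel free to open a pull request. When the script finishes, it sometimes creates a last row that holds only empty/None values (every value except the ID is None or an empty string). Do not use those rows; they are inconvenient to work with.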
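
The placeholder rows mentioned in the tip can be filtered out after export. The sketch below assumes each table has been exported as a list of dictionaries with an `ID` key; that export format and the helper name `drop_empty_rows` are assumptions for illustration, not part of the library.

```python
from typing import Any, Dict, List


def drop_empty_rows(
    rows: List[Dict[str, Any]], id_column: str = "ID"
) -> List[Dict[str, Any]]:
    """Remove rows in which every value other than the ID is None or ''."""

    def is_empty(row: Dict[str, Any]) -> bool:
        return all(
            value is None or value == ""
            for key, value in row.items()
            if key != id_column
        )

    return [row for row in rows if not is_empty(row)]


# Rows shaped like the "automobiles" table printed above.
automobiles_rows = [
    {"ID": 0, "name": "truck", "brand": "BMW", "num_wheels": 4},
    {"ID": 1, "name": "bike", "brand": None, "num_wheels": 2},
    {"ID": 2, "name": None, "brand": None, "num_wheels": None},
]
print(drop_empty_rows(automobiles_rows))
```

A row such as the last one in `engine_?_creators` is kept, because its `PARENT_ID` and `is_scalar` values are set even though `scalar` is None.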

A complete example can be found here: https://github.com/zolekode/json-to-tables/blob/master/example.py