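I have a table with 30 columns, some of which are JSON arrays. At the moment I parse them by hand, but I'm wondering whether there is a dynamic way to do it.
The table looks like this: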
| flight.atfcmMeasureLocations.item | flight.ctfmAirspaceProfile | flight.ctfmPointProfile |
|-----------------------------------|----------------------------|-------------------------|
| null | [{...}] | [] |
| null | [] | [{...},{...}] |
| [{...},{...}] | [{...}] | [{...}, {...}] |
Right now I'm parsing it like this, but it feels far too tedious; there must be a way to automate the process.
SELECT field::jsonb -> 'FlightAtfcmMcdmOnlyLocation' as flightAtfcmMcdmOnlyLocation,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'hotspotId' as flightAtfcmRegulationLocation_hotspotId,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'mcdmState' as flightAtfcmRegulationLocation_mcdmState,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'measureSubType' as flightAtfcmRegulationLocation_measureSubType,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'referenceLocation-ReferenceLocationAerodrome' as flightAtfcmRegulationLocation_referenceLocationReferenceLocationAerodrome,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'referenceLocation-ReferenceLocationAerodromeSet' as flightAtfcmRegulationLocation_referenceLocationReferenceLocationAerodromeSet,
field::jsonb -> 'FlightAtfcmRegulationLocation' -> 'referenceLocation-ReferenceLocationAirspace' ->> 'id' as flightAtfcmRegulationLocation_referenceLocationReferenceLocationAirspace_id,
field::jsonb -> 'FlightAtfcmRegulationLocation' -> 'referenceLocation-ReferenceLocationAirspace' ->> 'type' as flightAtfcmRegulationLocation_referenceLocationReferenceLocationAirspace_type,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'referenceLocation-ReferenceLocationDBEPoint' as FlightAtfcmRegulationLocation_referenceLocationReferenceLocationDBEPoint,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'referenceLocation-ReferenceLocationPublishedPoint' as FlightAtfcmRegulationLocation_referenceLocationReferenceLocationPublishedPoint,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'regulationId' as FlightAtfcmRegulationLocation_regulationId,
field::jsonb -> 'FlightAtfcmRegulationLocation' ->> 'toConfirm' as FlightAtfcmRegulationLocation_toConfirm,
field::jsonb -> 'FlightAtfcmReroutingLocation' as FlightAtfcmReroutingLocation
FROM (Select json_array_elements(case
when ("flight.atfcmMeasureLocations.item"::text = '[]' OR
"flight.atfcmMeasureLocations.item"::text = 'null') then '[null]'::json
else "flight.atfcmMeasureLocations.item" end) field
from eurocontrol_data) as json;
SELECT field::jsonb -> 'referenceLocation-ReferenceLocationAerodrome' as referenceLocationReferenceLocationAerodrome,
field::jsonb -> 'referenceLocation-ReferenceLocationAerodromeSet' as referenceLocationReferenceLocationAerodromeSet,
field::jsonb -> 'referenceLocation-ReferenceLocationAirspace' ->> 'id' as referenceLocationReferenceLocationAirspace_id,
field::jsonb -> 'referenceLocation-ReferenceLocationAirspace' ->> 'type' as referenceLocationReferenceLocationAirspace_type,
field::jsonb -> 'referenceLocation-ReferenceLocationDBEPoint' as referenceLocationReferenceLocationDBEPoint,
field::jsonb -> 'referenceLocation-ReferenceLocationPublishedPoint' as referenceLocationReferenceLocationPublishedPoint,
field::jsonb -> 'regulationId' as regulationId,
field::jsonb -> 'toConfirm' as toConfirm
FROM (Select json_array_elements(case
when "flight.regulationLocations"::text = '[]' then '[null]'::json
else "flight.regulationLocations" end) field
from eurocontrol_data) as json;
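One way to avoid typing every `->` / `->>` expression by hand is to generate the select list from a sample element. The following is only a rough sketch under some assumptions: the helper names `sql_literal` and `jsonb_select_list` are made up for illustration, the sample values are placeholders (only the keys come from the queries above), and every leaf is extracted as text with `->>`, whereas the hand-written queries use `->` for some leaves. Note also that PostgreSQL truncates identifiers longer than 63 bytes, so very long aliases may need shortening.

```python
import json
import re


def sql_literal(text):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def jsonb_select_list(sample, field="field"):
    """Return one `field::jsonb -> ... ->> ... AS alias` line per leaf key."""
    expressions = []

    def walk(node, path):
        for key, value in node.items():
            new_path = path + [key]
            if isinstance(value, dict) and value:
                walk(value, new_path)  # descend into nested objects
                continue
            # Intermediate steps use -> (jsonb), the final step uses ->> (text).
            steps = "".join(f" -> {sql_literal(k)}" for k in new_path[:-1])
            # Aliases keep only letters, digits and underscores.
            alias = "_".join(re.sub(r"\W", "", k) for k in new_path)
            expressions.append(
                f'{field}::jsonb{steps} ->> {sql_literal(new_path[-1])} AS "{alias}"'
            )

    walk(sample, [])
    return ",\n".join(expressions)


# Hypothetical sample element: the keys appear in the queries above,
# the values are placeholders.
sample = json.loads("""
{
  "FlightAtfcmRegulationLocation": {
    "regulationId": "REG1",
    "referenceLocation-ReferenceLocationAirspace": {"id": "A1", "type": "AREA"}
  },
  "FlightAtfcmReroutingLocation": null
}
""")

select_list = jsonb_select_list(sample)
print(select_list)
```

The generated text can then be pasted after `SELECT` in front of the existing `FROM (...)` subquery. Keys that are missing from the chosen sample element will not show up, so it is worth picking a sample that is as complete as possible.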
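My main goal is to parse the JSON into tables and link each original cell to the rows of the new table (possibly with an index), so that the main table and the new tables are related. The result would look something like this: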
| flight.atfcmMeasureLocations.item | flight.ctfmAirspaceProfile | flight.ctfmPointProfile |
|-----------------------------------|----------------------------|-------------------------|
| null | 1 | [] |
| null | [] | 1 |
| 1 | 2 | 2 |
To be honest, I'm a beginner, so I'm not sure what the right way to do this is.
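To make the target layout concrete, here is a minimal sketch of the idea in plain Python. It is an illustration only: the function name `split_array_column`, the child-table columns `parent_key`, `position` and `value`, and the element contents are all assumptions, since the real elements are elided as `{...}` above. Each non-empty array cell is replaced by a key, and its elements are moved to child rows that carry the same key; `null` and `[]` cells are left alone.

```python
import json


def split_array_column(rows, column):
    """Replace each non-empty JSON array in `column` with a key into a child table."""
    main_rows, child_rows = [], []
    next_key = 1
    for row in rows:
        cell = row[column]
        # Cells may arrive as JSON text or as already-decoded values.
        items = json.loads(cell) if isinstance(cell, str) else cell
        new_row = dict(row)
        if isinstance(items, list) and items:
            new_row[column] = next_key
            for position, item in enumerate(items):
                child_rows.append(
                    {"parent_key": next_key, "position": position, "value": item}
                )
            next_key += 1
        else:
            new_row[column] = items  # null and [] stay as they are
        main_rows.append(new_row)
    return main_rows, child_rows


column = "flight.ctfmAirspaceProfile"
# Placeholder elements; the real ones are not shown in the question.
rows = [
    {column: '[{"k": "a"}]'},
    {column: "[]"},
    {column: '[{"k": "b"}]'},
]

main_rows, child_rows = split_array_column(rows, column)
print([row[column] for row in main_rows])
print(child_rows)
```

The `parent_key` column is what would be indexed to relate the main table to the new one.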
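Answer 0 (score: 0)
I recently ran into a similar problem, so I decided to write a Python library that automates the job. Here is the link to the library: https://github.com/zolekode/json-to-tables.
For completeness, let me copy the example from the README there:
Suppose this is your JSON file: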
[
{
"name": "truck",
"brand": "BMW",
"num_wheels": 4,
"engine": {
"brand": "RR",
"date_of_production": {
"day": 3,
"month": "Feb",
"year": 1990
},
"creators": ["Sandy", "Leslie", "Kane"]
}
},
{
"name": "bike",
"num_wheels": 2,
"top_speed": "100Km/hr",
"engine": {
"brand": "Audi",
"date_of_production": {
"day": 2,
"month": "Sep",
"year": 2002
},
"creators": ["Anabel", {"GreenMotors": {"CEO": "Charles Green"}}]
}
}
]
First, load the JSON string.
import json
automobiles = json.loads(automobiles)
Then run the following code:
extent_table = ExtentTable()
table_maker = TableMaker(extent_table)
root_table_name = "automobiles" # could use any other name
table_maker.convert_json_objects_to_tables(automobiles, root_table_name)
table_maker.show_tables(num_elements=5)
SHOWING TABLES :D
automobiles
ID name brand num_wheels engine top_speed
0 0 truck BMW 4 0 None
1 1 bike None 2 1 100Km/hr
2 2 None None None None None
____________________________________________________
engine
ID brand date_of_production
0 0 RR 0
1 1 Audi 1
2 2 None None
____________________________________________________
date_of_production
ID day month year
0 0 3 Feb 1990
1 1 2 Sep 2002
2 2 None None None
____________________________________________________
engine_?_creators
ID PARENT_ID is_scalar scalar
0 0 0 True Sandy
1 1 0 True Leslie
2 2 0 True Kane
3 3 1 True Anabel
4 4 1 False None
____________________________________________________
GreenMotors
ID CEO
0 0 Charles Green
1 1 None
____________________________________________________
engine_?_creators_$_GreenMotors
ID GreenMotors PARENT_ID
0 0 0 4
1 1 None None
____________________________________________________
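Tip: try to keep attribute names as unique as possible. This solution works for my project. If you find any bugs or have ideas for improvements, you can create a pull request. After the script finishes, it sometimes creates a last row with empty/None values (every value other than ID is None or an empty string). You should not use these rows; they are inconvenient to handle.
The full example can be found here: https://github.com/zolekode/json-to-tables/blob/master/example.py.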
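If the placeholder rows get in the way, they can be filtered out after export. This is a small sketch that assumes each table has been turned into a list of dicts keyed by column name; that representation and the helper name `drop_empty_rows` are assumptions for illustration, not part of the library.

```python
def drop_empty_rows(rows, id_column="ID"):
    """Drop rows whose values, other than the ID, are all None or empty strings."""

    def is_empty(row):
        return all(
            value is None or value == ""
            for key, value in row.items()
            if key != id_column
        )

    return [row for row in rows if not is_empty(row)]


# The "engine" table from the output above, written as a list of dicts.
engine = [
    {"ID": 0, "brand": "RR", "date_of_production": 0},
    {"ID": 1, "brand": "Audi", "date_of_production": 1},
    {"ID": 2, "brand": None, "date_of_production": None},
]

cleaned = drop_empty_rows(engine)
print(cleaned)
```

A value of `0` is kept, because only `None` and the empty string count as empty here.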