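Properly extracting a JSON array from a JSONB field

Asked: 2019-06-25 14:27:54

Tags: sql arrays json postgresql jsonb

From a table in PostgreSQL 10, I am trying to join all array elements from multiple children of the same jsonb field to their parent row, somewhat like in this question and this one. But I am making a mistake in my JOIN: instead of bare individual array elements, I get individual array elements that are each wrapped in an array of their own.

Here is the table, abbreviated: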

CREATE TABLE public.worker_customformstore (
    id integer NOT NULL DEFAULT nextval('worker_customformstore_id_seq'::regclass),
    created_on timestamp with time zone NOT NULL,
    store jsonb,
    schema_id integer NOT NULL,
    polymorphic_ctype_id integer,
    pdf_key character varying(100) COLLATE pg_catalog."default" NOT NULL,
    last_updated timestamp with time zone
)

And a sample value of the store field:

'{"Subcontractor Use": {
        "labor": [
            {
                "note": null,
                "hours": {
                    "dt": null,
                    "ot": null,
                    "st": 1,
                    "pdt": null,
                    "pot": null
                },
                "employee": {
                    "id": 456,
                    "trade": "XXX",
                    "is_active": true,
                    "last_name": "Uknow",
                    "first_name": "Noone",
                    "company_supplied_id": "456"
                },
                "external subcontractor": false
            },
            {
                "note": null,
                "hours": {
                    "dt": null,
                    "ot": null,
                    "st": 8,
                    "pdt": null,
                    "pot": null
                },
                "employee": {
                    "id": 123,
                    "trade": "",
                    "member": null,
                    "is_active": true,
                    "last_name": "Guy",
                    "user_role": "WORKER",
                    "first_name": "Some",
                    "company_supplied_id": "123"
                },
                "external subcontractor": false
            }
        ],
        "Equipment": [
            {
                "note": null,
                "hours": {
                    "idle": null,
                    "over": null,
                    "running": 8
                },
                "quantity": 1,
                "equipment": {
                    "id": 6243,
                    "status": "Rented",
                    "project": "8399",
                    "category": "XXXXX",
                    "caltrans_id": "00-20",
                    "description": "19",
                    "equipment_id": "Scissor",
                    "idle_time_price": 0,
                    "over_time_price": 0,
                    "running_time_price": 0
                }
            }
        ]
    }
}'

My simplified query looks like this:

SELECT 
cufstore.id, 
CASE
    WHEN labor IS NOT DISTINCT FROM NULL THEN
    0
    WHEN (jsonb_array_elements(labor) -> 'hours' ->> 'st') = '' THEN
    0
    ELSE
    COALESCE((jsonb_array_elements(labor) -> 'hours' ->> 'st')::numeric, 0)
END
-- more stuff here ...
as total_hours,

CASE
    WHEN labor IS NOT DISTINCT FROM NULL THEN
    0
    ELSE
    COALESCE(jsonb_array_length(cufstore.store -> 'Subcontractor Use' -> 'labor'), 0)
END as total_workers,

labor, equipment

FROM public.worker_customformstore AS cufstore
...

LEFT OUTER JOIN LATERAL 
    (SELECT
        jsonb_array_elements(jsonb_strip_nulls(cufstore.store -> 'Subcontractor Use' -> 'labor'))
        WHERE cufstore.store -> 'Subcontractor Use' ->> 'labor' IS NOT NULL
    ) labor on true

LEFT OUTER JOIN LATERAL 
    (SELECT
        jsonb_array_elements(jsonb_strip_nulls(cufstore.store -> 'Subcontractor Use' -> 'Equipment'))
        WHERE cufstore.store -> 'Subcontractor Use' ->> 'Equipment' IS NOT NULL
    ) equipment on true

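Besides ending up with a lot of redundant jsonb_array_elements calls, this also keeps me from refactoring the repeated logic into a function: when I try, I get an error about a set-returning function inside COALESCE in the function definition (although there is no complaint when the same expression appears in my query body).

I think what I want is more like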

LEFT OUTER JOIN LATERAL 
    jsonb_array_elements(jsonb_strip_nulls(cufstore.store -> 'Subcontractor Use' -> 'labor')) labor
    ON jsonb_typeof(labor) = 'array'

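But my attempts at this either fail with cannot extract elements from a scalar, or give me NULL or data that otherwise looks wrong.

I may be fundamentally misunderstanding what I can do, but this is what the equipment column looks like: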

("{""hours"": {""running"": 8}, ""quantity"": 1, . . .}")

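and I would like to ask for equipment -> 'hours' ->> 'running' without having to wrap it in jsonb_array_elements(equipment). Do I need to do that, or have I accidentally added the parentheses at the start and end of the column value?

1 Answer:

Answer 0: (score: 1)

It is unclear how the elements of the two nested JSON arrays "labor" and "Equipment" relate to each other. From your sample it seems that "Equipment" has only a single element and the array wrapper is just noise...

Unfortunately, there is also another nested key "equipment", which is easily confused with the other one.

It is also unclear to me what the objective is.

Still, after removing a lot of noise and unnecessary complication, this might be close to what you are after: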

SELECT s.id
     , COALESCE((NULLIF(labor->'hours'->>'st', ''))::numeric, 0) AS total_hours
     , CASE WHEN labor IS NULL THEN 0
            ELSE COALESCE(jsonb_array_length(s.store->'Subcontractor Use'->'labor'), 0)
       END AS total_workers
     , s.store #>> '{Subcontractor Use, Equipment, 0, hours, running}' AS equipment_hours
     , labor
FROM   worker_customformstore s
LEFT   JOIN jsonb_array_elements(s.store->'Subcontractor Use'->'labor') labor ON true;

db<>fiddle here
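As far as I can tell, the parentheses asked about in the question come from referencing the alias of a subquery, which yields the whole row rather than the jsonb value inside it. The sketch below is a minimal, self-contained illustration; the names sub and elem are made up for this example and do not appear in the question.

```sql
-- "sub" is the alias of a one-column subquery, so selecting "sub"
-- returns the whole row (displayed in parentheses), while "sub.elem"
-- returns the bare jsonb element.
SELECT sub      AS whole_row
     , sub.elem AS bare_element
FROM  (SELECT jsonb_array_elements('[{"a": 1}]'::jsonb) AS elem) sub;
```

When the set-returning function is placed directly in the FROM list, as in the query above, the alias also serves as the column name, so labor refers to the jsonb element itself.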

Notes

This verbose expression:

CASE
    WHEN labor IS NOT DISTINCT FROM NULL THEN
    0
    WHEN (jsonb_array_elements(labor) -> 'hours' ->> 'st') = '' THEN
    0
    ELSE
    COALESCE((jsonb_array_elements(labor) -> 'hours' ->> 'st')::numeric, 0)
END

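boils down to:

COALESCE((NULLIF(labor -> 'hours' ->> 'st', ''))::numeric, 0)

  • Don't apply jsonb_array_elements() again; that is already done in the lateral join.

  • labor IS NOT DISTINCT FROM NULL is the same as labor IS NULL, but we need neither, because the COALESCE that follows covers that case anyway.

  • With NULLIF there is no need for a CASE with yet another branch.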
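To see the three cases side by side, here is a small self-contained sketch. The inline VALUES list and the column name st are stand-ins for labor -> 'hours' ->> 'st', not part of the original query.

```sql
-- A number, an empty string and NULL, run through the same expression.
SELECT st
     , COALESCE(NULLIF(st, '')::numeric, 0) AS hours
FROM  (VALUES ('8'), (''), (NULL)) AS t(st);
-- hours is 8, 0 and 0 respectively
```

NULLIF turns the empty string into NULL before the cast, so the cast never sees '' and COALESCE handles both remaining cases.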

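Assuming there is only a single element in the nested JSON array "Equipment", we can directly use s.store #>> '{Subcontractor Use, Equipment, 0, hours, running}' for equipment_hours. If that assumption does not hold, you will have to do more (and explain more).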
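A quick way to check what the path operator does is to run it against a literal. The jsonb value below is a trimmed-down, made-up version of the sample store value.

```sql
-- #>> follows the path and returns the value as text.
-- Index 0 picks the first array element; index 1 does not exist here.
SELECT doc #>> '{Subcontractor Use, Equipment, 0, hours, running}' AS first_element
     , doc #>> '{Subcontractor Use, Equipment, 1, hours, running}' AS second_element
FROM  (SELECT '{"Subcontractor Use": {"Equipment": [{"hours": {"running": 8}}]}}'::jsonb) AS t(doc);
```

The first column comes back as the text 8, the second as NULL, since a path that does not exist yields NULL rather than an error.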


Addressing your comment

If store -> 'Subcontractor Use' -> 'labor' is not a nested JSON array but, for example, a scalar, you get the error you mentioned in the comment:

ERROR: cannot extract elements from a scalar

db<>fiddle here

You can avoid exceptions like that with a nested CASE expression.
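The code that originally accompanied this remark is not preserved here, so the following is only a sketch of one way to write such a guard, using jsonb_typeof() inside the CASE. The inline VALUES rows are made up so the example runs without the table.

```sql
-- Only hand the value to jsonb_array_elements() when it really is an array.
-- For anything else the CASE yields NULL, the function returns no rows,
-- and the LEFT JOIN keeps the parent row with labor = NULL.
SELECT s.id, labor
FROM  (VALUES
         (1, '{"Subcontractor Use": {"labor": [{"hours": {"st": 1}}]}}'::jsonb)
       , (2, '{"Subcontractor Use": {"labor": "n/a"}}'::jsonb)
      ) AS s(id, store)
LEFT   JOIN LATERAL jsonb_array_elements(
          CASE WHEN jsonb_typeof(s.store->'Subcontractor Use'->'labor') = 'array'
               THEN s.store->'Subcontractor Use'->'labor'
          END) labor ON true;
```

Row 1 returns its array element; row 2, where "labor" is a scalar, is kept with a NULL labor instead of raising the error.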

db<>fiddle here

You may want to do more to return an alternative value for that case...