Flattening a nested JSON structure in PostgreSQL

Time: 2018-08-14 23:06:43

Tags: json postgresql object nested

I am trying to write a Postgres query that outputs my JSON data in a specific format.

JSON data structure

{
    "user_id": 123,
    "data": {
        "skills": {
            "skill_1": {
                "title": "skill_1",
                "rating": 4,
                "description": "description text"
            },
            "skill_2": {
                "title": "skill_2",
                "rating": 2,
                "description": "description text"
            },
            "skill_3": {
                "title": "skill_3",
                "rating": 5,
                "description": "description text"
            },
            ...
        }
    }
}

This is the format I need the data in at the end:

[
    {
        user_id: 123,
        skill_1: 4, 
        skill_2: 2, 
        skill_3: 5, 
                    ... 
    },
    {
        user_id: 456,
        skill_1: 1, 
        skill_2: 3, 
        skill_3: 4, 
                    ... 
    }
]

So far I am using a query like this:

SELECT
    user_id,
    data#>>'{skills, "skill_1",  rating}' AS "skill_1",
    data#>>'{skills, "skill_2",  rating}' AS "skill_2",
    data#>>'{skills, "skill_3",  rating}' AS "skill_3"
FROM some_table

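There must be a better way to write this query. There are 400+ rows and 70+ skills, so the query above gets out of hand quickly. Any guidance or help would be appreciated.

Some notes:

  1. Users rate themselves on 70+ skills
  2. Every skill object has the same structure
  3. Every user rates themselves on exactly the same set of skills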

1 answer:

Answer 0 (score: 0)

db<>fiddle

I extended your test data to the following (note the array around all users):

[{
    "user_id": 123,
    "data": {
        "skills": {
            "skill_1": {
                "title": "skill_1",
                "rating": 4,
                "description": "description text"
            },
            "skill_2": {
                "title": "skill_2",
                "rating": 2,
                "description": "description text"
            },
            "skill_3": {
                "title": "skill_3",
                "rating": 5,
                "description": "description text"
            }
        }
    }
},
{
    "user_id": 456,
    "data": {
        "skills": {
            "skill_1": {
                "title": "skill_1",
                "rating": 1,
                "description": "description text"
            },
            "skill_2": {
                "title": "skill_2",
                "rating": 3,
                "description": "description text"
            },
            "skill_3": {
                "title": "skill_3",
                "rating": 4,
                "description": "description text"
            }
        }
    }
}]

Query:

SELECT 
    jsonb_pretty(jsonb_agg(user_id || skills))               -- E
FROM (
    SELECT
        json_build_object('user_id', user_id)::jsonb as user_id,  -- D
        json_object_agg(skill_title, skills -> skill_title -> 'rating')::jsonb as skills
    FROM (
        SELECT 
            user_id,
            json_object_keys(skills) as skill_title,         -- C
            skills
        FROM (
            SELECT
                (datasets -> 'user_id')::text as user_id,
                datasets -> 'data' -> 'skills' as skills     -- B
            FROM (
                SELECT 
                  json_array_elements(json) as datasets      -- A
                FROM (
                  SELECT '/* the JSON data; see db<>fiddle */'::json
                )s
            )s
        )s  
    )s    
    GROUP BY user_id
    ORDER BY user_id
)s

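A: json_array_elements puts every array element ({user_id: '42', data: {...}}) on its own row.

B: The first column saves the user_id. It is cast to text because the later GROUP BY cannot group on json values (the json type has no equality operator). The second column extracts the user's skills data.

C: json_object_keys extracts the skill titles, which are used as the keys in D.

D.1: skills -> skill_title -> 'rating' extracts the rating value from each skill.

D.2: json_object_agg aggregates the skill titles and their rating values into one JSON object, grouped by user_id.

D.3: json_build_object turns the user_id into a JSON object again.

E.1: user_id || skills merges the two jsonb objects into one.

E.2: jsonb_agg aggregates these objects into an array.

E.3: jsonb_pretty makes the result look pretty.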
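
To make the inner steps easier to follow, here is a minimal sketch that runs only steps A to D.1 and returns the rows before any aggregation. The sample literal is a shortened, made-up version of the test data (two users, two skills), the CTE names src, elements and users are illustrative only, and ->> is used as a shortcut for the text cast in B.

```sql
-- Steps A to D.1 in isolation, on a shortened made-up sample.
WITH src(doc) AS (
    SELECT '[
      {"user_id": 123, "data": {"skills": {
          "skill_1": {"title": "skill_1", "rating": 4},
          "skill_2": {"title": "skill_2", "rating": 2}}}},
      {"user_id": 456, "data": {"skills": {
          "skill_1": {"title": "skill_1", "rating": 1},
          "skill_2": {"title": "skill_2", "rating": 3}}}}
    ]'::json
),
elements AS (                                -- A: one row per array element
    SELECT json_array_elements(doc) AS datasets
    FROM src
),
users AS (                                   -- B: user_id as text, plus the skills object
    SELECT datasets ->> 'user_id'         AS user_id,
           datasets -> 'data' -> 'skills' AS skills
    FROM elements
)
SELECT user_id,
       skill_title,                          -- C: one row per (user, skill key)
       skills -> skill_title -> 'rating' AS rating   -- D.1: the rating of that skill
FROM users
CROSS JOIN LATERAL json_object_keys(skills) AS skill_title
ORDER BY user_id, skill_title;
```

This should return four rows, one per user and skill. Steps D.2 to E then only fold those rows back into one object per user and one array overall.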

Result:

[{
    "skill_1": 4,
    "skill_2": 2,
    "skill_3": 5,
    "user_id": "123"
},
{
    "skill_1": 1,
    "skill_2": 3,
    "skill_3": 4,
    "user_id": "456"
}]
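
If the data already lives in a table, as in the question, the same idea can probably be written more directly. The following is a minimal sketch, not a tested drop-in: it assumes a table some_table(user_id int, data jsonb), where the column types are an assumption (the question only shows the column names; with a json column, cast with data::jsonb). The sample rows are shortened from the test data above.

```sql
-- Assumed schema: the question only shows the column names, not the types.
CREATE TEMP TABLE some_table (user_id int, data jsonb);

INSERT INTO some_table (user_id, data) VALUES
  (123, '{"skills": {"skill_1": {"title": "skill_1", "rating": 4},
                     "skill_2": {"title": "skill_2", "rating": 2},
                     "skill_3": {"title": "skill_3", "rating": 5}}}'),
  (456, '{"skills": {"skill_1": {"title": "skill_1", "rating": 1},
                     "skill_2": {"title": "skill_2", "rating": 3},
                     "skill_3": {"title": "skill_3", "rating": 4}}}');

SELECT jsonb_agg(
           -- one flat object per user: {"user_id": ..., "skill_1": ..., ...}
           jsonb_build_object('user_id', t.user_id)
               || COALESCE(r.ratings, '{}'::jsonb)   -- users without skills keep only user_id
           ORDER BY t.user_id
       ) AS result
FROM some_table AS t
CROSS JOIN LATERAL (
    -- jsonb_each yields one (key, value) row per skill; keep only the rating
    SELECT jsonb_object_agg(s.key, s.value -> 'rating') AS ratings
    FROM jsonb_each(t.data -> 'skills') AS s
) AS r;
```

Because user_id is never cast to text here, it stays a JSON number in the output. The query does not list any skill name, so it does not grow with the number of skills. It needs PostgreSQL 9.5 or later for jsonb_build_object, jsonb_object_agg and the || operator.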