Put values 1-4 into separate columns

Time: 2019-05-15 14:44:32

Tags: postgresql window-functions postgresql-9.6

I have a Postgres 9.6 database with a table of people and their nationalities, like this:

  person_id   nationality  
 ----------- ------------- 
          1   American     
          2   British      
          3   Canadian     
          3   Dutch        
          3   Ethiopian    
          3   French       
          3   German       
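
To reproduce the examples, the sample data could be loaded roughly as follows. This is only a sketch: the table name nationalities is taken from the queries in this post, while the column types and NOT NULL constraints are assumptions.

```sql
-- Assumed schema: the column types are a guess, not stated in the question.
CREATE TABLE nationalities (
    person_id   integer NOT NULL,
    nationality text    NOT NULL
);

-- Sample rows copied from the table above.
INSERT INTO nationalities (person_id, nationality) VALUES
    (1, 'American'),
    (2, 'British'),
    (3, 'Canadian'),
    (3, 'Dutch'),
    (3, 'Ethiopian'),
    (3, 'French'),
    (3, 'German');
```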

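I am building a table for analysis purposes that contains one row per person. I want to add four columns holding each person's first four nationalities. This is my expected result: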

  person_id    nat_a     nat_b     nat_c     nat_d   
 ----------- ---------- ------- ----------- -------- 
          1   American                               
          2   British                                
          3   Canadian   Dutch   Ethiopian   French  

Person 3's fifth nationality (German) is not shown because only the first four are kept. For persons 1 and 2, nat_b through nat_d are NULL.

I currently create this table with the following query:

SELECT DISTINCT
    person_id,
    nth_value(nationality, 1) OVER w AS nat_a,
    nth_value(nationality, 2) OVER w AS nat_b,
    nth_value(nationality, 3) OVER w AS nat_c,
    nth_value(nationality, 4) OVER w AS nat_d
FROM nationalities
WINDOW w AS (PARTITION BY person_id ORDER BY nationality ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

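This query returns the expected result. However, I am not happy with the approach. Because nth_value is a window function, I have to specify a window and then apply DISTINCT. I would prefer to use GROUP BY or something similar.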

Is there a more efficient way to solve this?

2 answers:

Answer 0: (score: 0)

You wrote that you want to avoid window functions. This answer still uses one, row_number, but it does not need DISTINCT. Maybe it helps.

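As mentioned in the comments, you want to create a pivot table. For that, the query needs a criterion that tells it which element goes into the first, second, ... new column, for example the row number within each group. A window function provides this easily.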

demo: db<>fiddle

WITH ordered AS (
    SELECT 
        *,
        row_number() OVER (PARTITION BY person_id ORDER BY nationality)
    FROM
        nationalities
)
SELECT
    person_id,
    MAX(nationality) FILTER (WHERE row_number = 1) AS nat_a,
    MAX(nationality) FILTER (WHERE row_number = 2) AS nat_b,
    MAX(nationality) FILTER (WHERE row_number = 3) AS nat_c,
    MAX(nationality) FILTER (WHERE row_number = 4) AS nat_d
FROM
    ordered
GROUP BY person_id
ORDER BY person_id

A solution without window functions:

demo: db<>fiddle

WITH ordered AS (
    SELECT 
        *
    FROM (
        SELECT 
            person_id,
            array_agg(nationality ORDER BY nationality) AS a
        FROM
            nationalities
        GROUP BY person_id
    ) s,
    unnest(a) WITH ORDINALITY AS a(nationality, ordinality)
)
SELECT
    person_id,
    MAX(nationality) FILTER (WHERE ordinality = 1) AS nat_a,
    MAX(nationality) FILTER (WHERE ordinality = 2) AS nat_b,
    MAX(nationality) FILTER (WHERE ordinality = 3) AS nat_c,
    MAX(nationality) FILTER (WHERE ordinality = 4) AS nat_d
FROM
    ordered
GROUP BY person_id
ORDER BY person_id

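This query aggregates all nationalities per person_id into an ordered array and unnests it WITH ORDINALITY, which also generates the row numbers.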
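
If the array is built anyway, the unnest step may not be needed: PostgreSQL returns NULL for an array subscript outside the array bounds, so the elements can be picked by position directly. The following is a sketch of that alternative, not something taken from the question; the aliases s and nats are illustrative names.

```sql
-- Sketch: aggregate once per person, then pick array elements by position.
-- An out-of-range subscript yields NULL, so people with fewer than four
-- nationalities get NULL in the remaining columns.
SELECT
    person_id,
    nats[1] AS nat_a,
    nats[2] AS nat_b,
    nats[3] AS nat_c,
    nats[4] AS nat_d
FROM (
    SELECT
        person_id,
        array_agg(nationality ORDER BY nationality) AS nats
    FROM nationalities
    GROUP BY person_id
) AS s
ORDER BY person_id;
```

This uses only GROUP BY, with no window function and no DISTINCT. How it performs compared with the other variants depends on the data and would have to be measured.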

However, this version is much slower: demo: db<>fiddle. In fact, your version seems to be the fastest in this case.
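
To check this on your own data, each variant can be prefixed with EXPLAIN (ANALYZE, BUFFERS), which executes the query and reports the actual plan and timings. A sketch using the query from the question; the numbers will vary with table size, indexes, and caching:

```sql
-- Sketch: measure the original nth_value query.
-- EXPLAIN ANALYZE really runs the statement, so use it on a test copy
-- if the query is expensive.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT
    person_id,
    nth_value(nationality, 1) OVER w AS nat_a,
    nth_value(nationality, 2) OVER w AS nat_b,
    nth_value(nationality, 3) OVER w AS nat_c,
    nth_value(nationality, 4) OVER w AS nat_d
FROM nationalities
WINDOW w AS (
    PARTITION BY person_id
    ORDER BY nationality
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
```

Running the same wrapper around the other queries and comparing the reported execution times gives a fairer picture than a single run on a small fiddle.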

Answer 1: (score: 0)

If you do not want to use window functions, you can use Postgres LATERAL subqueries:

SELECT DISTINCT person_id, a.nat_a, b.nat_b, c.nat_c, d.nat_d
FROM nationalities
    -- --------------------------------------------------------
    -- A
    INNER JOIN LATERAL
    (
        SELECT person_id, MIN(nationality)
        FROM nationalities
        GROUP BY person_id
    ) AS a(person, nat_a) ON a.person = nationalities.person_id
    -- --------------------------------------------------------
    -- B
    LEFT JOIN LATERAL
    (
        SELECT person_id, MIN(nationality)
        FROM nationalities
        WHERE nationality > a.nat_a
        GROUP BY person_id
    ) AS b(person, nat_b) ON b.person = nationalities.person_id
    -- --------------------------------------------------------
    -- C
    LEFT JOIN LATERAL
    (
        SELECT person_id, MIN(nationality)
        FROM nationalities
        WHERE nationality > b.nat_b
        GROUP BY person_id
    ) AS c(person, nat_c) ON c.person = nationalities.person_id
    -- --------------------------------------------------------
    -- D
    LEFT JOIN LATERAL
    (
        SELECT person_id, MIN(nationality)
        FROM nationalities
        WHERE nationality > c.nat_c
        GROUP BY person_id
    ) AS d(person, nat_d) ON d.person = nationalities.person_id

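Because you order alphabetically, nat_a is always MIN(nationality). Each successive lateral join (a LEFT JOIN, so that people with fewer nationalities are still kept) looks up the "next MIN" nationality, that is, the smallest nationality greater than the one found in the previous step.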
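
The same "next MIN" idea can also be written with subqueries that are correlated on person_id, which avoids grouping the whole table in every step and makes the outer DISTINCT unnecessary. This is a sketch of a variant, not the query above; the aliases p, n, a, b, c, d are illustrative.

```sql
-- Sketch: start from one row per person, then step to the next MIN per column.
-- An aggregate without GROUP BY always returns exactly one row (NULL when
-- nothing matches), so CROSS JOIN LATERAL never drops a person.
SELECT p.person_id, a.nat_a, b.nat_b, c.nat_c, d.nat_d
FROM (SELECT DISTINCT person_id FROM nationalities) AS p
CROSS JOIN LATERAL (
    SELECT MIN(n.nationality) AS nat_a
    FROM nationalities AS n
    WHERE n.person_id = p.person_id
) AS a
CROSS JOIN LATERAL (
    SELECT MIN(n.nationality) AS nat_b
    FROM nationalities AS n
    WHERE n.person_id = p.person_id
      AND n.nationality > a.nat_a
) AS b
CROSS JOIN LATERAL (
    SELECT MIN(n.nationality) AS nat_c
    FROM nationalities AS n
    WHERE n.person_id = p.person_id
      AND n.nationality > b.nat_b
) AS c
CROSS JOIN LATERAL (
    SELECT MIN(n.nationality) AS nat_d
    FROM nationalities AS n
    WHERE n.person_id = p.person_id
      AND n.nationality > c.nat_c
) AS d
ORDER BY p.person_id;
```

Note that the strict > comparison skips duplicates: if a person has the same nationality listed twice, it appears only once here, whereas the row_number and nth_value approaches would count both rows.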