I have a Postgres 9.6 database containing a table of people and their nationalities, like this:
person_id   nationality
----------- -------------
1           American
2           British
3           Canadian
3           Dutch
3           Ethiopian
3           French
3           German
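I am building a table for analysis purposes that contains one row per person. I want to add four columns holding each person's first four nationalities. This is my expected result: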
person_id   nat_a      nat_b   nat_c       nat_d
----------- ---------- ------- ----------- --------
1           American
2           British
3           Canadian   Dutch   Ethiopian   French
Person 3's fifth nationality (German) does not appear because it is the fifth one. For persons 1 and 2, nat_b through nat_d are NULL.
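To reproduce the sample data above, a minimal setup sketch follows. The table name nationalities is taken from the query in this question; the column types (integer and text) and the NOT NULL constraints are assumptions, since the real schema is not shown.

```sql
-- Assumed schema: the actual column types are not given in the question.
CREATE TABLE nationalities (
    person_id   integer NOT NULL,
    nationality text    NOT NULL
);

-- Sample rows copied from the table shown above.
INSERT INTO nationalities (person_id, nationality) VALUES
    (1, 'American'),
    (2, 'British'),
    (3, 'Canadian'),
    (3, 'Dutch'),
    (3, 'Ethiopian'),
    (3, 'French'),
    (3, 'German');
```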
I currently create this table as follows:
SELECT DISTINCT
person_id,
nth_value(nationality, 1) OVER w AS nat_a,
nth_value(nationality, 2) OVER w AS nat_b,
nth_value(nationality, 3) OVER w AS nat_c,
nth_value(nationality, 4) OVER w AS nat_d
FROM nationalities
WINDOW w AS (PARTITION BY person_id ORDER BY nationality ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
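This query returns the expected result. However, I am not very happy with the approach. Because nth_value is a window function, I have to define a window and then apply DISTINCT to collapse the duplicate rows. I would prefer to use GROUP BY or something similar.
Is there a more efficient way to solve this?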
Answer 0 (score: 0)
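You wrote that you want to avoid window functions. This answer still uses one, row_number(), but it does not need DISTINCT. Maybe it helps.
As mentioned in the comments, you want to create a pivot table. For that you need a criterion that tells the query which element goes into the first, second, ... new column, for example the row number within each group. That is easy to get with a window function.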
WITH ordered AS (
SELECT
*,
row_number() OVER (PARTITION BY person_id ORDER BY nationality)
FROM
nationalities
)
SELECT
person_id,
MAX(nationality) FILTER (WHERE row_number = 1) AS nat_a,
MAX(nationality) FILTER (WHERE row_number = 2) AS nat_b,
MAX(nationality) FILTER (WHERE row_number = 3) AS nat_c,
MAX(nationality) FILTER (WHERE row_number = 4) AS nat_d
FROM
ordered
GROUP BY person_id
ORDER BY person_id
A solution without window functions:
WITH ordered AS (
SELECT
*
FROM (
SELECT
person_id,
array_agg(nationality ORDER BY nationality) AS a
FROM
nationalities
GROUP BY person_id
) s,
unnest(a) WITH ORDINALITY AS a(nationality, ordinality)
)
SELECT
person_id,
MAX(nationality) FILTER (WHERE ordinality = 1) AS nat_a,
MAX(nationality) FILTER (WHERE ordinality = 2) AS nat_b,
MAX(nationality) FILTER (WHERE ordinality = 3) AS nat_c,
MAX(nationality) FILTER (WHERE ordinality = 4) AS nat_d
FROM
ordered
GROUP BY person_id
ORDER BY person_id
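This query aggregates all nationalities per person_id into an ordered array and then unnests it WITH ORDINALITY, which also generates the row numbers.
However, this version is much slower (demo: db<>fiddle). In fact, your version seems to be the fastest in this case.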
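As a possible simplification of the array idea, the unnest step can be skipped by indexing the aggregated array directly. This is only a sketch and was not part of the benchmark mentioned above, so its speed relative to the other versions is unknown. It relies on PostgreSQL arrays being 1-based and on an out-of-range subscript returning NULL, which gives NULL for people with fewer than four nationalities.

```sql
-- Aggregate once per person, then pick elements 1..4 from the sorted array.
-- A subscript beyond the array's length yields NULL in PostgreSQL.
SELECT
    person_id,
    a[1] AS nat_a,
    a[2] AS nat_b,
    a[3] AS nat_c,
    a[4] AS nat_d
FROM (
    SELECT
        person_id,
        array_agg(nationality ORDER BY nationality) AS a
    FROM
        nationalities
    GROUP BY person_id
) s
ORDER BY person_id;
```

This uses only GROUP BY, with no window function and no DISTINCT.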
Answer 1 (score: 0)
If you do not want to use window functions, you can use Postgres's LATERAL subqueries:
SELECT DISTINCT person_id, a.nat_a, b.nat_b, c.nat_c, d.nat_d
FROM nationalities
-- --------------------------------------------------------
-- A: smallest nationality per person
INNER JOIN LATERAL (
    SELECT person_id, MIN(nationality)
    FROM nationalities
    GROUP BY person_id
) AS a(person, nat_a) ON a.person = nationalities.person_id
-- --------------------------------------------------------
-- B: smallest nationality greater than nat_a
LEFT JOIN LATERAL (
    SELECT person_id, MIN(nationality)
    FROM nationalities
    WHERE nationality > a.nat_a
    GROUP BY person_id
) AS b(person, nat_b) ON b.person = nationalities.person_id
-- --------------------------------------------------------
-- C: smallest nationality greater than nat_b
LEFT JOIN LATERAL (
    SELECT person_id, MIN(nationality)
    FROM nationalities
    WHERE nationality > b.nat_b
    GROUP BY person_id
) AS c(person, nat_c) ON c.person = nationalities.person_id
-- --------------------------------------------------------
-- D: smallest nationality greater than nat_c
LEFT JOIN LATERAL (
    SELECT person_id, MIN(nationality)
    FROM nationalities
    WHERE nationality > c.nat_c
    GROUP BY person_id
) AS d(person, nat_d) ON d.person = nationalities.person_id
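Because you are ordering alphabetically, nat_a will always be MIN(nationality). Each subsequent lateral join (a LEFT JOIN, so that people with only one nationality are kept) looks up the "next MIN" nationality, that is, the smallest value greater than the one found in the previous step.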