Question

I have a table like this:

| Col1       | Col2        |
|:-----------|------------:|
| 1          |        a;b; |
| 1          |        b;c; |
| 2          |        c;d; |
| 2          |        d;e; |

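I would like the result to look like this:

| Col1       | Col2        |
|:-----------|------------:|
| 1          |      a;b;c; |
| 2          |      c;d;e; |

Is there a way to write some kind of set function that collects the unique values of a column into an array and then displays them? I am using a Redshift database, which is mostly PostgreSQL-compatible, apart from the differences listed on the "Unsupported PostgreSQL Functions" documentation page.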

Answer 1

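Take a look at Redshift's listagg() function, which is similar to MySQL's group_concat. You would need to split the items first and then use listagg() to get the list of values. Note, however, that the documentation states:

LISTAGG does not support DISTINCT expressions

(Edit: as of October 11, 2018, DISTINCT is now supported. See the docs.)

So you have to take care of the deduplication yourself. Assume you have the following table set up: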

create table _test (col1 int, col2 varchar(10));
insert into _test values (1, 'a;b;'), (1, 'b;c;'), (2, 'c;d;'), (2, 'd;e;');
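
Before aggregating, it can help to look at what the splitting step returns for this table. The following is a minimal sketch, assuming the `_test` table above; the column aliases `item_1` to `item_3` are made up for illustration. `split_part()` returns an empty string when the requested part does not exist, which is why empty values have to be filtered out once the number of items varies.

```sql
-- Inspect the individual items of col2.
-- For 'a;b;' the third part is an empty string, because nothing follows the last ';'.
select
    col1
  , col2
  , split_part(col2, ';', 1) as item_1
  , split_part(col2, ';', 2) as item_2
  , split_part(col2, ';', 3) as item_3
from _test
;
```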

Fixed number of items in `Col2`

Perform as many split_part() operations as there are items in Col2:

select
    col1
  , listagg(col2, ';') within group (order by col2)
from (
        select col1, split_part(col2, ';', 1) as col2 from _test
  union select col1, split_part(col2, ';', 2) as col2 from _test
) as _parts
group by col1
;
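
Following the edit note above that LISTAGG now accepts DISTINCT, the deduplication could presumably be moved into the aggregate itself instead of relying on `union`. This is a sketch under that assumption, again for exactly two items per row. The subquery alias `_parts` is a hypothetical name, and the trailing `|| ';'` is only there to re-add the final delimiter shown in the question's expected output.

```sql
select
    col1
    -- distinct removes duplicates such as the 'b' that appears in both rows of col1 = 1
  , listagg(distinct col2, ';') within group (order by col2) || ';' as col2
from (
              select col1, split_part(col2, ';', 1) as col2 from _test
    union all select col1, split_part(col2, ';', 2) as col2 from _test
) as _parts
group by col1
;
```

With `union all` the subquery no longer deduplicates anything, so all of that work is left to `listagg(distinct ...)`.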

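Varying number of items in `Col2`

You would need a helper here. If the table has more rows than there are items in Col2, a workaround using row_number() can work (but it is expensive for large tables):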

with _helper as (
    select
        (row_number() over())::int as part_number
    from
        _test
),
_values as (
    select distinct
        col1
      , split_part(col2, ';', part_number) as col2
    from
        _test, _helper
    where
        length(split_part(col2, ';', part_number)) > 0
)
select
    col1
  , listagg(col2, ';') within group (order by col2) as col2
from
    _values
group by
    col1
;
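
If relying on the row count of `_test` is not an option, the helper could instead be a short explicit list of part numbers. The sketch below assumes that no value of `Col2` holds more than four items; the limit of 4 is an arbitrary choice for illustration. Provided every item is terminated by `;`, `regexp_count(col2, ';')` gives the item count and can be used to check the real maximum.

```sql
-- Optional check: how many items does the longest col2 value hold?
-- (assumes every item ends with ';')
-- select max(regexp_count(col2, ';')) from _test;

with _helper as (
    -- hand-written list of part numbers; extend it if col2 can hold more than 4 items
              select 1 as part_number
    union all select 2
    union all select 3
    union all select 4
),
_values as (
    select distinct
        col1
      , split_part(col2, ';', part_number) as col2
    from
        _test cross join _helper
    where
        -- drop the empty strings returned for part numbers beyond the last item
        length(split_part(col2, ';', part_number)) > 0
)
select
    col1
  , listagg(col2, ';') within group (order by col2) as col2
from
    _values
group by
    col1
;
```

This keeps the cross join small (rows of `_test` times four) rather than growing with the square of the table size, at the cost of a hard-coded upper bound.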
