Question

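I have a Redshift table that contains the following column

How can I extract the values that start with cat_ from this column? (Each row has only one such value, at a different position in the array.)

I would like to get these results: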

cat_incident

cat_feature_missing

cat_duplicated_request

Thanks!

Answer 1

There is no simple way to extract multiple values from a single column in SQL (or at least not in the SQL dialect that Redshift uses).

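You could write a User-Defined Function (UDF) that returns a string containing those values, separated by newlines. Whether that is acceptable depends on what you want to do with the output (for example, JOIN against it).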
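As a rough sketch of what the body of such a UDF might look like: Redshift scalar UDFs can be written in Python, with the body wrapped in a CREATE FUNCTION ... LANGUAGE plpythonu statement. The function name `extract_cat_tags` and the `prefix` parameter are made up for illustration, and the code assumes the column holds a comma-separated string.

```python
def extract_cat_tags(tags, prefix="cat_"):
    """Return every tag that starts with `prefix`, joined by newlines.

    Assumption: `tags` is a comma-separated string such as
    'blah,cat_incident,mcr_close_ticket'. A NULL column value arrives as None.
    """
    if tags is None:
        return None
    # Split on commas and drop surrounding whitespace from each tag.
    parts = [part.strip() for part in tags.split(",")]
    # Keep only the tags with the wanted prefix; the result is '' if none match.
    return "\n".join(part for part in parts if part.startswith(prefix))


if __name__ == "__main__":
    print(extract_cat_tags("blah-blah,cat_feature_missing,cat_duplicated_request"))
```

Because the result is a single newline-separated string, it is still one value per row, which is why joining against it is awkward.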

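Another option is to preprocess the data before loading it into Redshift, putting that information into a separate one-to-many table in which each value sits in its own row. Returning this information would then be simple.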
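A minimal sketch of that preprocessing step, assuming the source rows are available as (id, comma-separated tags) pairs. The column names `ticket_id` and `tag`, the sample rows, and both helper functions are hypothetical.

```python
import csv
import io

# Hypothetical sample rows: (ticket_id, comma-separated tags).
SAMPLE_ROWS = [
    (1, "blah,cat_incident,mcr_close_ticket"),
    (2, "blah-blah,cat_feature_missing,cat_duplicated_request"),
]


def explode_tags(rows):
    """Yield one (ticket_id, tag) pair per tag, skipping empty tags."""
    for ticket_id, tags in rows:
        for tag in tags.split(","):
            tag = tag.strip()
            if tag:
                yield ticket_id, tag


def to_csv(pairs):
    """Serialize the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ticket_id", "tag"])
    writer.writerows(pairs)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(explode_tags(SAMPLE_ROWS)))
```

Once a file like this is loaded into its own table, a plain filter on the tag column (for example, LEFT(tag, 4) = 'cat_') returns the cat_ values one per row.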

Answer 2

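You can do this with a tally table (a table of numbers). See this link for information on how to create one: http://www.sqlservercentral.com/articles/T-SQL/62867/

Below is an example of how you would use it. In real life, you should use a permanent tally table instead of the temporary #tally table.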

--create sample table with data
create table #a (tags varchar(500));

insert into #a
select 'blah,cat_incident,mcr_close_ticket'
union
select 'blah-blah,cat_feature_missing,cat_duplicated_request';

--create tally table
create table #tally(n int);
insert into #tally
select 1
union select 2
union select 3
union select 4
union select 5
;

--get tags
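select * from
(
-- one output row per tag: row n of the tally table picks the n-th tag
select TRIM(SPLIT_PART(a.tags, ',', t.n)) AS single_tag
from #tally t
inner join #a a ON t.n <= REGEXP_COUNT(a.tags, ',') + 1 and t.n < 1000
) s
-- compare the prefix literally; in LIKE, '_' would match any single character
where LEFT(single_tag, 4) = 'cat_'
;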

Answer 3

Thanks! In the end, I managed to do it with the following query:

SELECT SUBSTRING(SUBSTRING(tags, charindex('cat_', tags), len(tags)), 0, charindex(',', SUBSTRING(tags, charindex('cat_', tags), len(tags)))) tags FROM table
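The nested SUBSTRING/CHARINDEX calls are hard to read, so here is a rough Python sketch of the same idea: find where cat_ starts, then cut at the next comma. It is not a line-by-line port of the SQL. The function name `first_cat_tag` is made up, and the input is assumed to be a comma-separated string. Two cases are handled explicitly, and it may be worth checking how the SQL version behaves on them: a row with no cat_ tag, and a cat_ tag that is last in the string with no trailing comma.

```python
def first_cat_tag(tags, prefix="cat_"):
    """Return the first substring that starts at `prefix` and ends before the next comma.

    Assumption: `tags` is a comma-separated string. Like the SQL version,
    this searches for the prefix anywhere in the string, not only at the
    start of a tag.
    """
    start = tags.find(prefix)
    if start == -1:
        # No cat_ tag in this row.
        return None
    end = tags.find(",", start)
    if end == -1:
        # The cat_ tag is the last one, so it runs to the end of the string.
        end = len(tags)
    return tags[start:end]


if __name__ == "__main__":
    print(first_cat_tag("blah,cat_incident,mcr_close_ticket"))
```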

Redshift - Extract values matching a condition from an array
