Question

My table in Redshift contains some concatenated ids:

Product_id , options_id
1,           2
5,           5;9;7
52,          4;5;8,11

I want to split the table so that each option id ends up on its own row:

 Product_id , options_id
 1 ,           2
 5,            5
 5,            9
 5,            7
 52,           4
 52,           5
 52,           9

In the Redshift documentation I found a similar function, 'split_part', but with it I have to supply the number of the part I want to get, e.g.:

Product_id , options_id
5,          5;9;7

split_part(options_id, ';', 2) will return 9.

Please help. Thanks.

Answer 1

Stolen from this answer: Split column into multiple rows in Postgres

select product_id, p.option
from product_options po,
     unnest(string_to_array(po.options_id, ';')) p(option)

sqlfiddle

Answer 2

So, the problem here is to take one row and split it into multiple rows. This is not too hard in PostgreSQL: you can use the unnest() function.

However, Amazon Redshift does not implement every feature available in PostgreSQL, and unnest() is not supported.

While it is possible to write a User Defined Function in Redshift, such a function can only return a single value, not several rows.

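A good option is to iterate over the parts, extracting each one in turn. See the workaround in "Error while using regexp_split_to_table (Amazon Redshift)" for a clever implementation (though it is still a hack). This is similar to "Expanding JSON arrays to rows with SQL on RedShift".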
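As a rough illustration of iterating over the parts, here is a minimal sketch that joins each row to a small list of part numbers and calls split_part() once per part. It is not taken from the linked answers. The table name product_options is borrowed from Answer 1, the numbers CTE is a hypothetical helper, and the cap of 5 options per row is an assumption you would need to adjust for your data.

```sql
-- Sketch only: product_options is the table name used in Answer 1.
-- "numbers" is a hypothetical helper listing the part positions to extract;
-- it assumes no row holds more than 5 options (extend it if yours do).
with numbers as (
    select 1 as n union all
    select 2 union all
    select 3 union all
    select 4 union all
    select 5
)
select po.product_id,
       split_part(po.options_id, ';', numbers.n) as option_id
from product_options po
join numbers
  -- number of parts = number of ';' delimiters + 1
  on numbers.n <= regexp_count(po.options_id, ';') + 1
order by po.product_id, numbers.n;
```

For the row with product_id 5 and options_id '5;9;7', the join matches n = 1, 2, 3, so the query produces three rows with option_id 5, 9 and 7. Any row with more parts than the numbers list covers is silently truncated, which is one reason this remains a hack.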

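The bottom line is that you can come up with hacks that work to a limited extent, but the best option is to clean the data before loading it into Amazon Redshift. At present, Redshift is optimized for extremely fast queries over large amounts of data, but it is not fully featured for data manipulation. This may change in the future (just as User Defined Functions were not originally available), but for now we have to work within its current capabilities.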

Split a field in Redshift
