PostgreSQL: merge rows that have the same key (hstore or json)

Time: 2015-05-04 15:16:29

Tags: json postgresql multimap hstore

I have a table like this:

+--------+--------------------+   
|   ID   |   Attribute        |  
+--------+--------------------+ 
|    1   |"color" => "red"    |    
+--------+--------------------+  
|    1   |"color" => "green"  | 
+--------+--------------------+ 
|    1   |"shape" => "square" | 
+--------+--------------------+ 
|    2   |"color" => "blue"   | 
+--------+--------------------+ 
|    2   |"color" => "black"  | 
+--------+--------------------+ 
|    2   |"flavor" => "sweat" | 
+--------+--------------------+ 
|    2   |"flavor" => "salty" | 
+--------+--------------------+ 

I would like to run a Postgres query that produces a result table like this:

+--------+------------------------------------------------------+   
|   ID   |                    Attribute                         |  
+--------+------------------------------------------------------+ 
|    1   |"color" => "red, green", "shape" => "square"          |    
+--------+------------------------------------------------------+  
|    2   |"color" => "blue, black", "flavor" => "sweat, salty"  | 
+--------+------------------------------------------------------+ 

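The attribute column can be in either hstore or json format. I wrote the example in hstore, but if this cannot be done in hstore and can be done in json, I will change the column to json.

I know that hstore does not support multiple values for one key; when I tried some merge approaches, only one value was kept per key. For json, I did not find anything that supports this kind of multi-value merge either. I think it could be done with a function that merges the values of the same key into a string/text and adds it back to the key/value pair, but I am stuck implementing it.

Note: if this is implemented in a function, ideally no key such as color or shape should appear in the function body, because the set of keys can grow dynamically.

Does anyone have any ideas? Any suggestion or brainstorming would help. Thanks!

2 Answers:

Answer 0: (score: 0)

One note before anything else: in your desired output, I would use proper json rather than something that merely looks like it. So, in my view, the correct output is: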

+--------+----------------------------------------------------------------------+   
|   ID   |                             Attribute                                |  
+--------+----------------------------------------------------------------------+ 
|    1   | '{"color":["red","green"], "flavor":[], "shape":["square"]}'         |    
+--------+----------------------------------------------------------------------+  
|    2   | '{"color":["blue","black"], "flavor":["sweat","salty"], "shape":[]}' | 
+--------+----------------------------------------------------------------------+ 

A PL/pgSQL function that parses the json attributes and executes a dynamic query does the job, something like this:

CREATE OR REPLACE FUNCTION merge_rows(PAR_table regclass) RETURNS TABLE (
    id          integer,
    attributes  json
) AS $$
DECLARE
    ARR_attributes  text[];
    VAR_attribute   text;
    ARR_query_parts text[];
    ARR_column_defs text[];
BEGIN
    -- Collect the distinct JSON key names found anywhere in the table
    EXECUTE format('SELECT array_agg(name ORDER BY name) AS name FROM (SELECT DISTINCT json_object_keys(attribute) AS name FROM %s) AS s', PAR_table) INTO ARR_attributes;

    -- For each key, build one json_build_object() argument pair and one
    -- column definition for json_to_record(); %I quotes the key as an
    -- identifier so keys with spaces or upper-case letters do not break the query
    FOREACH VAR_attribute IN ARRAY ARR_attributes LOOP
        ARR_query_parts := array_append(ARR_query_parts, format('%L, array_remove(array_agg(l.%I), null)', VAR_attribute, VAR_attribute));
        ARR_column_defs := array_append(ARR_column_defs, format('%I text', VAR_attribute));
    END LOOP;

    -- Run the assembled query and return its rows
    RETURN QUERY EXECUTE format('
        SELECT t.id, json_build_object(%s) AS attributes 
            FROM %s AS t, 
            LATERAL json_to_record(t.attribute) AS l(%s) 
            GROUP BY t.id;', 
        array_to_string(ARR_query_parts, ', '), PAR_table, array_to_string(ARR_column_defs, ', '));
END;
$$ LANGUAGE plpgsql;

I have tested it and it seems to work; it returns json. Here is my test code:

CREATE TABLE mytable (
    id          integer NOT NULL,
    attribute   json    NOT NULL

);
INSERT INTO mytable (id, attribute) VALUES 
(1, '{"color":"red"}'),
(1, '{"color":"green"}'),
(1, '{"shape":"square"}'),
(2, '{"color":"blue"}'),
(2, '{"color" :"black"}'),
(2, '{"flavor":"sweat"}'),
(2, '{"flavor":"salty"}');

SELECT * FROM merge_rows('mytable');

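Of course, you could also pass the id and attribute column names as parameters and perhaps refine the function a bit; this is only meant to give you the idea.

EDIT: If you are on 9.4, consider using the jsonb data type; it is much better and leaves you room for improvement. You only need to change the json_* functions to their jsonb_* equivalents.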
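
For comparison, the same grouping can probably be expressed without dynamic SQL. The following is only a sketch of a different technique, not the function above: it expands each json object into key/value rows with json_each_text(), collects the values per (id, key) with json_agg(), and then rebuilds one object per id with json_object_agg(). It assumes the mytable test table defined above and a server that has json_object_agg() (9.4 or later).

```sql
-- Sketch: merge values per key with a static query (assumes mytable from the test code above)
SELECT s.id, json_object_agg(s.key, s.vals) AS attributes
FROM (
    -- one row per (id, key), with all values of that key gathered into a json array
    SELECT t.id, e.key, json_agg(e.value) AS vals
    FROM mytable AS t
    CROSS JOIN LATERAL json_each_text(t.attribute) AS e(key, value)
    GROUP BY t.id, e.key
) AS s
GROUP BY s.id;
```

Unlike the function, this variant omits a key that never occurs for an id instead of returning it as an empty array, and the order of the values inside each array is not guaranteed unless an ORDER BY is added inside json_agg().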

Answer 1: (score: 0)

If you only need this for display purposes, the following may be enough:

select id, string_agg(key||' => '||vals, ', ')
from (
  select t.id, x.key, string_agg(value, ',') vals
  from t
   join lateral each(t.attributes) x on true
  group by id, key       
) t
group by id;

If you are on a version older than 9.3, lateral joins are not available, so you need skeys() and svals() instead:

select id, string_agg(key||' => '||vals, ', ')
from (
  select id, key, string_agg(val, ',') as vals
  from (
    select t.id, skeys(t.attributes) as key, svals(t.attributes) as val
    from t
  ) t1 
  group by id, key
) t2
group by id;

This returns:

id | string_agg                                
---+-------------------------------------------
 1 | color => red,green, shape => square       
 2 | color => blue,black, flavor => sweat,salty

SQLFiddle:http://sqlfiddle.com/#!15/98caa/2
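
If an actual hstore value is wanted rather than a display string, one possible variation on the first query is to feed the aggregated keys and merged values into the two-array form of the hstore() constructor. This is a sketch under the same assumptions as the queries above (a table t with an hstore column named attributes, and a server with LATERAL support); it was not part of the original answer.

```sql
-- Sketch: rebuild one hstore per id from the merged values (assumes table t as in the queries above)
SELECT s.id, hstore(array_agg(s.key), array_agg(s.vals)) AS attributes
FROM (
    -- one row per (id, key), with the values of that key joined into one text value
    SELECT t.id, x.key, string_agg(x.value, ', ') AS vals
    FROM t
    CROSS JOIN LATERAL each(t.attributes) AS x
    GROUP BY t.id, x.key
) AS s
GROUP BY s.id;
```

Each key then maps to a single comma-separated text value, which is the shape shown in the question; the values are still plain text, so a value that itself contains a comma cannot be split back reliably.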