Question

I have the following tables and need the output shown below.

Column names wrapped in asterisks are indexed.

agent_group

| id (INTEGER)   | **agent_id** (INTEGER) | **group_id** (INTEGER)|
| 1              | 87204                  | 29          |  
| 2              | 87204                  | 34          |
| 3              | 87204                  | 44          | 
| 4              | 87203                  | 38          | 
| 5              | 87203                  | 44          | 
| 6              | 87202                  | 42          | 
| 7              | 87202                  | 46          |

group

| **id**| **name**  (VARCHAR)            | **group_type_id** (INTEGER) | **customer_id** (INTEGER)
| 29    | Engineering                    | 1                           | 35
| 34    | Product Management             | 1                           | 35
| 38    | sales                          | 1                           | 35
| 42    | Support                        | 1                           | 35
| 44    | New York                       | 2                           | 35
| 46    | Chicago                        | 2                           | 35
| 49    | Florida                        | 2                           | 45

group_type

| **id**    | name  (VARCHAR)       
| 1         | Department
| 2         | Location    
| 3         | position

Output:

 | agent_id    | location     | department       
 | 87202       | Chicago      |  Support 
 | 87203       | New York     |  Sales
 | 87204       | New York     |  Engineering,Product Management

Imagine that agent_group and group each contain many thousands of rows. I tried the following query, but it performs slowly.

 select inner_groups.agent_id, inner_groups.groups->>'location' as location, inner_groups.groups->>'department' as department
       from (
                    select agent_id,json_object_agg(gt.name, g.name) as groups from 
                    agent_group ag 
                    join "group" g on g.id = ag.group_id
                    join group_type gt on gt.id = g.group_type_id
                    where g.customer_id = 35 and gt.name in ('location', 'department')
                    group by agent_id
         ) inner_groups

Query plan:

Subquery Scan on inner_groups  (cost=8816.65..9264.37 rows=12792 width=68) (actual time=236.937..660.838 rows=100783 loops=1)                                                                
  ->  GroupAggregate  (cost=8816.65..9072.49 rows=12792 width=36) (actual time=236.927..474.640 rows=100783 loops=1)                                                                         
        Group Key: ag.agent_id                                                                                                                                                     
        ->  Sort  (cost=8816.65..8848.63 rows=12792 width=131) (actual time=236.908..282.209 rows=263217 loops=1)                                                                            
              Sort Key: ag.agent_id                                                                                                                                                
              Sort Method: external merge  Disk: 11432kB                                                                                                                                     
              ->  Nested Loop  (cost=0.84..7944.04 rows=12792 width=131) (actual time=0.214..121.701 rows=263217 loops=1)                                                                    
                    ->  Nested Loop  (cost=0.42..2696.52 rows=2107 width=131) (actual time=0.207..5.654 rows=14619 loops=1)                                                                  
                          ->  Seq Scan on group_type gt  (cost=0.00..11.50 rows=2 width=122) (actual time=0.008..0.014 rows=6 loops=1)                                             
                                Filter: ((name)::text = ANY ('{location,department}'::text[]))                                                                                               
                                Rows Removed by Filter: 5                                                                                                                                    
                          ->  Index Scan using group_group_type_id_idx on group g  (cost=0.42..1227.57 rows=11494 width=17) (actual time=0.056..0.702 rows=2436 loops=6)  
                                Index Cond: (group_type_id = gt.id)                                                                                                                
                                Filter: (customer_id = 45)                                                                                                                                   
                                Rows Removed by Filter: 262                                                                                                                                  
                    ->  Index Scan using agent_group_agent_idx on agent_group ag  (cost=0.42..1.91 rows=58 width=8) (actual time=0.002..0.006 rows=18 loops=14619)        
                          Index Cond: (group_id = g.id)                                                                                                                            
Planning time: 0.465 ms                                                                                                                                                                      
Execution time: 667.866 ms

Answer 1

You seem to want conditional aggregation with string_agg(). In Postgres, you can use the convenient filter syntax:

select ag.agent_id,
       string_agg(distinct g.name, ',') filter (where gt.name = 'location') as locations,
       string_agg(distinct g.name, ',') filter (where gt.name = 'department') as department
from agent_group ag join
     "group" g   -- group is a reserved word, so it must be quoted
     on g.id = ag.group_id join
     group_type gt
     on gt.id = g.group_type_id
where g.customer_id = 35 and
      gt.name in ('location', 'department')
group by ag.agent_id;
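
To try the conditional aggregation on the question's sample rows, a minimal scratch setup might look like the following. The temporary tables, the column types, and the lowercase group_type names (chosen to match the query's filter rather than the capitalized names in the sample table) are assumptions made for illustration, not the asker's real schema.

```sql
-- Hypothetical scratch schema mirroring the sample data in the question.
CREATE TEMP TABLE group_type (id integer PRIMARY KEY, name varchar);
CREATE TEMP TABLE "group" (
    id            integer PRIMARY KEY,
    name          varchar,
    group_type_id integer,
    customer_id   integer
);
CREATE TEMP TABLE agent_group (
    id       integer PRIMARY KEY,
    agent_id integer,
    group_id integer
);

-- Assumption: type names are stored in lowercase, as the query's filter implies.
INSERT INTO group_type VALUES (1, 'department'), (2, 'location'), (3, 'position');

INSERT INTO "group" VALUES
    (29, 'Engineering',        1, 35),
    (34, 'Product Management', 1, 35),
    (38, 'sales',              1, 35),
    (42, 'Support',            1, 35),
    (44, 'New York',           2, 35),
    (46, 'Chicago',            2, 35),
    (49, 'Florida',            2, 45);

INSERT INTO agent_group VALUES
    (1, 87204, 29), (2, 87204, 34), (3, 87204, 44),
    (4, 87203, 38), (5, 87203, 44),
    (6, 87202, 42), (7, 87202, 46);

-- One row per agent; each FILTER clause restricts its aggregate to one group type.
SELECT ag.agent_id,
       string_agg(DISTINCT g.name, ',') FILTER (WHERE gt.name = 'location')   AS location,
       string_agg(DISTINCT g.name, ',') FILTER (WHERE gt.name = 'department') AS department
FROM agent_group ag
JOIN "group" g      ON g.id  = ag.group_id
JOIN group_type gt  ON gt.id = g.group_type_id
WHERE g.customer_id = 35
  AND gt.name IN ('location', 'department')
GROUP BY ag.agent_id
ORDER BY ag.agent_id;
```

On this data the query should return one row per agent, with agent 87204 getting `New York` as location and `Engineering,Product Management` as department. Agent 87203's department comes back as `sales`, in the lowercase spelling stored in the sample table.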

Answer 2

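If there are very many groups, the only option is a (slow) sort followed by a group aggregate.

If agent_group is large, a nested loop join will save you the sort.

If the number of groups is not too large, try increasing work_mem to get a hash aggregate. That should be considerably faster.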
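
One way to test the work_mem suggestion without changing the server configuration is to raise it for a single transaction and re-check the plan. This is only a sketch: the `64MB` value is an arbitrary assumption, not a recommendation, and the quoted `"group"` stands in for whatever the real table is called. It uses the question's json_object_agg query because, as far as I know, the planner cannot choose a hash aggregate when an aggregate uses DISTINCT.

```sql
BEGIN;

-- Assumption: 64MB is an illustrative value; SET LOCAL reverts when the transaction ends.
SET LOCAL work_mem = '64MB';

-- Re-run the inner query from the question and inspect the aggregation node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ag.agent_id,
       json_object_agg(gt.name, g.name) AS groups
FROM agent_group ag
JOIN "group" g      ON g.id  = ag.group_id
JOIN group_type gt  ON gt.id = g.group_type_id
WHERE g.customer_id = 35
  AND gt.name IN ('location', 'department')
GROUP BY ag.agent_id;

ROLLBACK;
```

In the new plan, look for a `HashAggregate` node in place of `Sort` plus `GroupAggregate`. If the sort remains, check whether its `Sort Method` line still reports `external merge` with disk usage, as it does in the plan from the question, or has moved to an in-memory sort.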

Improving the performance of a Postgres SQL query (version 10.5)
