我有下表,需要以下输出
被星号包围的列名具有索引。
agent_group
| id (INTEGER) | **agent_id** (INTEGER) | **group_id** (INTEGER)|
| 1 | 87204 | 29 |
| 2 | 87204 | 34 |
| 3 | 87204 | 44 |
| 4 | 87203 | 38 |
| 5 | 87203 | 44 |
| 6 | 87202 | 42 |
| 7 | 87202 | 46 |
组
| **id**| **name** (VARCHAR) | **group_type_id** (INTEGER) | **customer_id** (INTEGER)
| 29 | Engineering | 1 | 35
| 34 | Product Management | 1 | 35
| 38 | sales | 1 | 35
| 42 | Support | 1 | 35
| 44 | New York | 2 | 35
| 46 | Chicago | 2 | 35
| 49 | Florida | 2 | 45
group_type
| **id** | name (VARCHAR)
| 1 | Department
| 2 | Location
| 3 | position
输出:
| agent_id | location | department
| 87202 | Chicago | Support
| 87203 | New York | Sales
| 87204 | New York | Engineering,Product Management
想象一下,agent_group和group中有成千上万的行。 尝试了以下查询,但是性能很慢。
select inner_groups.agent_id, inner_groups.groups->>'location' as location, inner_groups.groups->>'department' as department
from (
select agent_id,json_object_agg(gt.name, g.name) as groups from
agent_group ag
join group g on g.id = ag.group_id
join group_type gt on gt.id = g.group_type_id
where g.customer_id = 35 and gt.name in ('location', 'department')
group by agent_id
) inner_groups
查询计划:
Subquery Scan on inner_groups (cost=8816.65..9264.37 rows=12792 width=68) (actual time=236.937..660.838 rows=100783 loops=1)
-> GroupAggregate (cost=8816.65..9072.49 rows=12792 width=36) (actual time=236.927..474.640 rows=100783 loops=1)
Group Key: ag.agent_id
-> Sort (cost=8816.65..8848.63 rows=12792 width=131) (actual time=236.908..282.209 rows=263217 loops=1)
Sort Key: ag.agent_id
Sort Method: external merge Disk: 11432kB
-> Nested Loop (cost=0.84..7944.04 rows=12792 width=131) (actual time=0.214..121.701 rows=263217 loops=1)
-> Nested Loop (cost=0.42..2696.52 rows=2107 width=131) (actual time=0.207..5.654 rows=14619 loops=1)
-> Seq Scan on group_type gt (cost=0.00..11.50 rows=2 width=122) (actual time=0.008..0.014 rows=6 loops=1)
Filter: ((name)::text = ANY ('{location,department}'::text[]))
Rows Removed by Filter: 5
-> Index Scan using group_group_type_id_idx on group g (cost=0.42..1227.57 rows=11494 width=17) (actual time=0.056..0.702 rows=2436 loops=6)
Index Cond: (group_type_id = gt.id)
Filter: (customer_id = 45)
Rows Removed by Filter: 262
-> Index Scan using agent_group_agent_idx on agent_group ag (cost=0.42..1.91 rows=58 width=8) (actual time=0.002..0.006 rows=18 loops=14619)
Index Cond: (group_id = g.id)
Planning time: 0.465 ms
Execution time: 667.866 ms
答案 0 :(得分:0)
您似乎想要使用string_agg()
进行条件聚合。在Postgres中,您可以使用便捷的filter
语法:
select agent_id,
string_agg(distinct g.name, ',') filter (where gt.name = 'location') as locations,
string_agg(distinct g.name, ',') filter (where gt.name = 'department') as department
from agent_group ag join
group g
on cg.id = ag.group_id join
group_type gt
on gt.id = g.group_type_id
where g.customer_id = 35 and
gt.name in ('location', 'department')
group by agent_id;
答案 1 :(得分:0)
如果组非常多,则唯一的选择是(慢)排序和组聚合。
如果agent_group
很大,嵌套循环联接将为您节省排序时间。
如果组的数量不太大,请尝试增加work_mem
以获得哈希聚合。那应该快得多。