I have a problem with a query that uses a subselect:
select
f.timestamp::date as date,
user_id,
activity_type,
f.container_id as group_id,
(
select
string_agg(distinct("userId"), ',') as group_owners
from
jusers_groups_copy g
where
g.place_id = f.container_id
and state like 'owner'
) as group_owners
from
fact_activity f
where
f.container_type like '700'
and f.timestamp::date < to_date('2016-09-05', 'YYYY-MM-DD')
group by
date, user_id, activity_type, group_id
order by
date, user_id, activity_type, group_id
In fact, the inner string_agg takes 20 seconds to render. I ran EXPLAIN on the query in pgAdmin, and it gave me this output:
"Group (cost=7029.62..651968.20 rows=17843 width=27) (actual time=431.017..4513.973 rows=11483 loops=1)"
" Buffers: shared hit=139498 read=411, temp read=255 written=255"
" -> Sort (cost=7029.62..7074.90 rows=18111 width=27) (actual time=430.630..667.098 rows=54660 loops=1)"
" Sort Key: ((f."timestamp")::date), f.user_id, f.activity_type, f.container_id"
" Sort Method: external merge Disk: 2008kB"
" Buffers: shared hit=1702 read=411, temp read=255 written=255"
" -> Seq Scan on fact_activity f (cost=0.00..5748.76 rows=18111 width=27) (actual time=0.107..188.827 rows=54660 loops=1)"
" Filter: ((container_type ~~ '700'::text) AND (("timestamp")::date < to_date('2016-09-05'::text, 'YYYY-MM-DD'::text)))"
" Rows Removed by Filter: 125414"
" Buffers: shared hit=1691 read=411"
" SubPlan 1"
" -> Aggregate (cost=36.12..36.13 rows=1 width=5) (actual time=0.315..0.318 rows=1 loops=11483)"
" Buffers: shared hit=137796"
" -> Seq Scan on users_groups_copy g (cost=0.00..36.09 rows=11 width=5) (actual time=0.041..0.266 rows=13 loops=11483)"
" Filter: ((state ~~ 'owner'::text) AND (place_id = f.container_id))"
" Rows Removed by Filter: 1593"
" Buffers: shared hit=137796"
"Total runtime: 4536.074 ms"
I also tried joining the tables instead, but the query was even slower:
select
f.timestamp::date as date,
user_id,
activity_type,
f.container_id as group_id,
string_agg(distinct("userId"), ',') as group_owners
from
fact_activity f
join jusers_groups_copy g
on g.place_id = f.container_id
where
f.container_type like '700'
and f.timestamp::date < to_date('2016-09-05', 'YYYY-MM-DD')
and g.state like 'owner'
group by
date, user_id, activity_type, group_id
order by
date, user_id, activity_type, group_id
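Finally, there are no indexes in this database. Is that why the query is slow?
I would like to know how to improve this query.
Thanks in advance.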
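Answer 0 (score: 1)
Without changing the query, the biggest performance gain will come from an index on the table used in the subselect, which speeds up the subselect: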
CREATE INDEX nice_name ON jusers_groups_copy(place_id, state text_pattern_ops);
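To check that the index is actually used, one option is to refresh the table statistics and run the subselect on its own under EXPLAIN. In the plan from the question, the sequential scan on the groups table is executed 11483 times at about 0.318 ms each, which is roughly 3.65 s of the 4.5 s total, so this is where an index scan should appear. A minimal sketch; the place_id value '42' is a hypothetical placeholder, not a value from the question:

```sql
-- Refresh planner statistics after creating the index.
ANALYZE jusers_groups_copy;

-- Run the subselect for a single group and inspect the plan.
-- '42' is a made-up place_id; substitute a real container_id.
EXPLAIN (ANALYZE, BUFFERS)
SELECT string_agg(DISTINCT g."userId", ',') AS group_owners
FROM jusers_groups_copy g
WHERE g.place_id = '42'
  AND g.state LIKE 'owner';
```

If the plan still shows a Seq Scan on such a small table, the planner may simply consider it cheaper for a single lookup; what matters is how the full query behaves.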
But I would rewrite the query as a join. That way you may get something more efficient than a nested loop, depending on your data.
Instead of
SELECT f.somecol,
(SELECT g.othercol
FROM jusers_groups_copy g
WHERE g.place_id = f.container_id
AND g.state LIKE 'owner')
FROM fact_activity f
WHERE ...;
you should write
SELECT f.somecol, g.othercol
FROM fact_activity f
JOIN jusers_groups_copy g
ON g.place_id = f.container_id
WHERE g.state LIKE 'owner'
AND ...;
Depending on the join type that is chosen, the index above (for a nested loop) or a different index can make the query faster.
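Applied to the query from the question, a plain join multiplies every activity row by the owners of its group (around 13 rows per group in the plan) before aggregating, which may be why the asker's join attempt was slower. One possible variation, which is a different technique from the plain join shown above, is to aggregate the owners once per place_id in a derived table and join to that. The sketch below assumes the column names from the question; the alias o is introduced here for illustration:

```sql
SELECT
    f."timestamp"::date AS date,
    f.user_id,
    f.activity_type,
    f.container_id AS group_id,
    o.group_owners
FROM fact_activity f
-- LEFT JOIN keeps groups without owners, like the original subselect,
-- which returns NULL for them.
LEFT JOIN (
    -- Build the owner list once per group instead of once per output row.
    SELECT
        g.place_id,
        string_agg(DISTINCT g."userId", ',') AS group_owners
    FROM jusers_groups_copy g
    WHERE g.state LIKE 'owner'
    GROUP BY g.place_id
) o ON o.place_id = f.container_id
WHERE f.container_type LIKE '700'
  AND f."timestamp"::date < to_date('2016-09-05', 'YYYY-MM-DD')
-- group_owners depends only on container_id, so adding it to
-- GROUP BY does not change the resulting groups.
GROUP BY 1, 2, 3, 4, 5
ORDER BY 1, 2, 3, 4;
```

Whether this is faster depends on the data and the plan chosen, so it is worth comparing both versions with EXPLAIN (ANALYZE, BUFFERS).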
Answer 1 (score: 0)