I have a query like the following, which joins against ~6000 values:
SELECT DISTINCT ON(user_id)
user_id,
finished_at as last_deposit_date,
CASE When currency = 'RUB' Then amount_cents END as last_deposit_amount_cents
FROM payments
JOIN (VALUES (5),(22),(26)) --~6000 values
AS v(user_id) USING (user_id)
WHERE action = 'deposit'
AND success = 't'
AND currency IN ('RUB')
ORDER BY user_id, finished_at DESC
Query plan for the query with the full list of values:
Unique (cost=444606.97..449760.44 rows=19276 width=24) (actual time=6129.403..6418.317 rows=5991 loops=1)
Buffers: shared hit=2386527, temp read=7807 written=7808
-> Sort (cost=444606.97..447183.71 rows=1030695 width=24) (actual time=6129.401..6295.457 rows=1877039 loops=1)
Sort Key: payments.user_id, payments.finished_at DESC
Sort Method: external merge Disk: 62456kB
Buffers: shared hit=2386527, temp read=7807 written=7808
-> Nested Loop (cost=0.43..341665.35 rows=1030695 width=24) (actual time=0.612..5085.376 rows=1877039 loops=1)
Buffers: shared hit=2386521
-> Values Scan on "*VALUES*" (cost=0.00..75.00 rows=6000 width=4) (actual time=0.002..4.507 rows=6000 loops=1)
-> Index Scan using index_payments_on_user_id on payments (cost=0.43..54.78 rows=172 width=28) (actual time=0.010..0.793 rows=313 loops=6000)
Index Cond: (user_id = "*VALUES*".column1)
Filter: (success AND ((action)::text = 'deposit'::text) AND ((currency)::text = 'RUB'::text))
Rows Removed by Filter: 85
Buffers: shared hit=2386521
Planning time: 5.886 ms
Execution time: 6429.685 ms
I am using PostgreSQL 10.8.0. Is there any way to speed up this query?
I tried replacing DISTINCT with a recursive CTE:
WITH RECURSIVE t AS (
(SELECT min(user_id) AS user_id FROM payments)
UNION ALL
SELECT (SELECT min(user_id) FROM payments
WHERE user_id > t.user_id
) AS user_id FROM
t
WHERE t.user_id IS NOT NULL
)
SELECT payments.* FROM t
JOIN (VALUES (5),(22),(26)) --~6000 VALUES
AS v(user_id) USING (user_id)
, LATERAL (
SELECT user_id,
finished_at as last_deposit_date,
CASE When currency = 'RUB' Then amount_cents END as last_deposit_amount_cents FROM payments
WHERE payments.user_id=t.user_id
AND action = 'deposit'
AND success = 't'
AND currency IN ('RUB')
ORDER BY finished_at DESC LIMIT 1
) AS payments
WHERE t.user_id IS NOT NULL;
But it turned out to be even slower:
Hash Join (cost=418.67..21807.22 rows=3000 width=24) (actual time=16.804..10843.174 rows=5991 loops=1)
Hash Cond: (t.user_id = "*VALUES*".column1)
Buffers: shared hit=6396763
CTE t
-> Recursive Union (cost=0.46..53.73 rows=101 width=8) (actual time=0.142..1942.351 rows=237029 loops=1)
Buffers: shared hit=864281
-> Result (cost=0.46..0.47 rows=1 width=8) (actual time=0.141..0.142 rows=1 loops=1)
Buffers: shared hit=4
InitPlan 3 (returns $1)
-> Limit (cost=0.43..0.46 rows=1 width=8) (actual time=0.138..0.139 rows=1 loops=1)
Buffers: shared hit=4
-> Index Only Scan using index_payments_on_user_id on payments payments_2 (cost=0.43..155102.74 rows=4858092 width=8) (actual time=0.137..0.138 rows=1 loops=1)
Index Cond: (user_id IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=4
-> WorkTable Scan on t t_1 (cost=0.00..5.12 rows=10 width=8) (actual time=0.008..0.008 rows=1 loops=237029)
Filter: (user_id IS NOT NULL)
Rows Removed by Filter: 0
Buffers: shared hit=864277
SubPlan 2
-> Result (cost=0.48..0.49 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=237028)
Buffers: shared hit=864277
InitPlan 1 (returns $3)
-> Limit (cost=0.43..0.48 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=237028)
Buffers: shared hit=864277
-> Index Only Scan using index_payments_on_user_id on payments payments_1 (cost=0.43..80786.25 rows=1619364 width=8) (actual time=0.007..0.007 rows=1 loops=237028)
Index Cond: ((user_id IS NOT NULL) AND (user_id > t_1.user_id))
Heap Fetches: 46749
Buffers: shared hit=864277
-> Nested Loop (cost=214.94..21498.23 rows=100 width=32) (actual time=0.475..10794.535 rows=167333 loops=1)
Buffers: shared hit=6396757
-> CTE Scan on t (cost=0.00..2.02 rows=100 width=8) (actual time=0.145..1998.788 rows=237028 loops=1)
Filter: (user_id IS NOT NULL)
Rows Removed by Filter: 1
Buffers: shared hit=864281
-> Limit (cost=214.94..214.94 rows=1 width=24) (actual time=0.037..0.037 rows=1 loops=237028)
Buffers: shared hit=5532476
-> Sort (cost=214.94..215.37 rows=172 width=24) (actual time=0.036..0.036 rows=1 loops=237028)
Sort Key: payments.finished_at DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=5532476
-> Index Scan using index_payments_on_user_id on payments (cost=0.43..214.08 rows=172 width=24) (actual time=0.003..0.034 rows=15 loops=237028)
Index Cond: (user_id = t.user_id)
Filter: (success AND ((action)::text = 'deposit'::text) AND ((currency)::text = 'RUB'::text))
Rows Removed by Filter: 6
Buffers: shared hit=5532473
-> Hash (cost=75.00..75.00 rows=6000 width=4) (actual time=2.255..2.255 rows=6000 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 275kB
-> Values Scan on "*VALUES*" (cost=0.00..75.00 rows=6000 width=4) (actual time=0.004..1.206 rows=6000 loops=1)
Planning time: 7.029 ms
Execution time: 10846.774 ms
Answer 0 (score: 1)
For this query:
SELECT DISTINCT ON (user_id)
p.user_id,
p.finished_at as last_deposit_date,
(CASE WHEN p.currency = 'RUB' THEN p.amount_cents END) as last_deposit_amount_cents
FROM payments p JOIN
(VALUES (5),( 22), (26) --~6000 values
) v(user_id)
USING (user_id)
WHERE p.action = 'deposit' AND
p.success = 't' AND
p.currency = 'RUB'
ORDER BY p.user_id, p.finished_at DESC;
I don't fully understand the CASE expression, because the WHERE clause already filters out every currency other than 'RUB'.
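If that reading is right, the CASE can simply be dropped. A sketch of the simplified query, assuming every row that survives the WHERE clause has currency = 'RUB', so the CASE would always return amount_cents:

```sql
-- Same query without the CASE: the WHERE clause already guarantees
-- currency = 'RUB', so amount_cents can be selected directly.
SELECT DISTINCT ON (user_id)
       p.user_id,
       p.finished_at  AS last_deposit_date,
       p.amount_cents AS last_deposit_amount_cents
FROM payments p
JOIN (VALUES (5), (22), (26)  -- ~6000 values in the real query
     ) v(user_id) USING (user_id)
WHERE p.action = 'deposit'
  AND p.success = 't'
  AND p.currency = 'RUB'
ORDER BY p.user_id, p.finished_at DESC;
```

This only tidies the query; it is not expected to change the plan on its own.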
That said, I would expect an index on (action, success, currency, user_id, finished_at desc) to help.
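A minimal sketch of creating that index. The index name is made up for illustration, and CONCURRENTLY is an optional assumption for building it without blocking writes on a busy table:

```sql
-- Hypothetical index name; the column order follows the suggestion above:
-- the three equality-filtered columns first, then user_id, then
-- finished_at DESC so the newest deposit per user is read first.
CREATE INDEX CONCURRENTLY index_payments_on_deposit_lookup
    ON payments (action, success, currency, user_id, finished_at DESC);

-- Refresh planner statistics after building the index.
ANALYZE payments;
```

After that, re-running the query under EXPLAIN (ANALYZE, BUFFERS) should show whether the planner picks the new index and whether the filter step and the on-disk sort from the original plan go away.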