Question

I have a bunch of tables in PostgreSQL, and I run a query as follows:

#!/bin/sh

git init test
cd test

touch a
git add a
git commit -m a

git checkout -b fix1
touch b
git add b
git commit -m b

git checkout -b fix2 master
touch c
git add c
git commit -m c

git checkout master
git merge --no-ff --no-edit fix1
git merge --no-ff --no-edit fix2

git revert --no-edit HEAD^ -m 1

git checkout fix1
echo "fix b" > b
git add b
git commit -m bb

git checkout master
# git merge fix1 # this will give an error!

# revert changes introduced by revert before merging
git revert --no-edit HEAD
git merge --no-ff --no-edit fix1

Note: the rent_flats table contains about 5 million rows, rent_flats_linked_users contains about 600k rows, and users contains 350k rows. The other tables are small.

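The query takes about 6.8 seconds to execute, and EXPLAIN ANALYZE shows that about 99% of the total time is spent in Hash and Hash Join nodes.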

With sequential scans turned off (enable_seqscan = off), the query takes even longer: about 11 seconds.

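Here is the EXPLAIN ANALYZE output for the query. I have put indexes on the fields involved in the inner joins and on the fields involved in the filters, such as phone_numbers.priority, cities.short_period and cities.long_period. How can I optimize this further and reduce the time spent in Hash and Hash Join?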

Answer 1

Your second WHERE clause is not sargable:

 AND (((extract(epoch from age(current_date,rent_flats.date_added))/86400)::int) IN (cities.short_period,cities.long_period))

If the columns involved are of type date and integer (which we could see in the table definitions), it can be rewritten as:

AND rent_flats.date_added IN (current_date - cities.short_period - 1
                            , current_date - cities.long_period - 1)
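To see what the rewrite buys, the two forms can be compared with EXPLAIN on a small stand-in schema. This is only a sketch: the cut-down temporary tables, the sample data and the index name are assumptions, not the asker's real definitions, and the plan PostgreSQL picks depends on your data and statistics. The first query prints the original day count next to plain date subtraction, a quick way to confirm the exact offset (including the `- 1`) before relying on the rewritten predicate.

```sql
-- Hypothetical, cut-down versions of the tables from the question;
-- only the columns used by the predicate are included.
CREATE TEMP TABLE cities (
    id           int PRIMARY KEY,
    short_period int NOT NULL,   -- assumed to be a number of days
    long_period  int NOT NULL    -- assumed to be a number of days
);

CREATE TEMP TABLE rent_flats (
    id         serial PRIMARY KEY,
    city_id    int  NOT NULL,
    date_added date NOT NULL
);

INSERT INTO cities VALUES (1, 7, 30), (2, 14, 60);

-- 100k sample rows spread over roughly one year.
INSERT INTO rent_flats (city_id, date_added)
SELECT 1 + n % 2, current_date - (n % 365)
FROM generate_series(1, 100000) AS n;

CREATE INDEX rent_flats_date_added_idx ON rent_flats (date_added);
ANALYZE cities;
ANALYZE rent_flats;

-- Compare the original expression with plain date subtraction.
-- Note: the epoch of an interval counts each month as 30 days.
SELECT d AS date_added,
       (extract(epoch FROM age(current_date, d)) / 86400)::int AS original_days,
       current_date - d AS date_diff_days
FROM (SELECT current_date - n AS d
      FROM generate_series(0, 45) AS n) AS sample;

-- Original form: date_added is wrapped in an expression,
-- so the plain index on date_added cannot serve this predicate.
EXPLAIN
SELECT r.id
FROM rent_flats r
JOIN cities c ON c.id = r.city_id
WHERE (extract(epoch FROM age(current_date, r.date_added)) / 86400)::int
      IN (c.short_period, c.long_period);

-- Rewritten form: the bare column is compared with computed dates,
-- which leaves the planner free to consider the index.
EXPLAIN
SELECT r.id
FROM rent_flats r
JOIN cities c ON c.id = r.city_id
WHERE r.date_added IN (current_date - c.short_period - 1,
                       current_date - c.long_period - 1);
```

If the two day-count columns of the first query differ for the dates you care about, adjust the offset in the rewritten predicate accordingly.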

That is an odd predicate. Are you sure you didn't mean this?

AND rent_flats.date_added BETWEEN current_date - cities.short_period - 1
                              AND current_date - cities.long_period - 1
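One detail to check with the range version: BETWEEN only matches when the first bound is not greater than the second. If long_period is the larger number (an assumption; the question does not say), then current_date - cities.long_period - 1 is the earlier date and would have to come first, or BETWEEN SYMMETRIC can be used instead. A minimal sketch with literal dates chosen purely for illustration:

```sql
SELECT
    -- Lower bound first: the value lies inside the range.
    date '2024-01-15' BETWEEN date '2024-01-10' AND date '2024-01-20' AS low_first,
    -- Bounds reversed: plain BETWEEN yields false.
    date '2024-01-15' BETWEEN date '2024-01-20' AND date '2024-01-10' AS high_first,
    -- BETWEEN SYMMETRIC orders the two bounds before comparing.
    date '2024-01-15' BETWEEN SYMMETRIC date '2024-01-20' AND date '2024-01-10' AS symmetric;
```

The first and third columns come back true and the second false, so the order of the two period expressions decides whether the range can match at all.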

There is probably more you can do, but that depends on information missing from the question. Most likely along these lines:

Optimizing Hash and Hash Joins in SQL Query