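We have a large table in Redshift where we store AWS billing files and query them. We use an interleaved sort key because not every query includes all of the filters.
Running this query: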
select "column", type, encoding, distkey, sortkey, "notnull"
from pg_table_def
where tablename = 'accountbillingflat'
and sortkey <> 0
order by sortkey;
gives this result:
Now, if we execute the following query:
SELECT
    payeraccountid,
    linkedaccountid,
    billingmonth,
    updatedon
FROM
    accountbillingflat
WHERE
    linkedaccountid IN ('123')
    AND billingmonth IN (201603)
GROUP BY
    payeraccountid,
    linkedaccountid,
    billingmonth,
    updatedon
ORDER BY
    billingmonth desc,
    updatedon desc
the query execution plan is:
XN Merge (cost=1000002059236.65..1000002059237.77 rows=445 width=44)
  -> XN Network (cost=1000002059236.65..1000002059237.77 rows=445 width=44)
    -> XN Sort (cost=1000002059236.65..1000002059237.77 rows=445 width=44)
      -> XN HashAggregate (cost=2059217.08..2059217.08 rows=445 width=44)
        -> XN Seq Scan on accountbillingflat (cost=0.00..2045174.64 rows=1404244 width=44)
The actual execution is:
Looking at the details of the sequential scan, we see:
This is strange, because linkedaccountid and billingmonth are both sort keys.
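One thing that might be worth checking alongside the plan is how skewed the interleaved sort key columns are. The query below is only a sketch, not a diagnosis: it assumes the SVV_INTERLEAVED_COLUMNS and STV_TBL_PERM system views are readable by the current user, and the aliases c and p are just illustrative.

-- Sketch: inspect interleaved sort key skew for the table in question.
-- interleaved_skew = 1.00 means no skew; larger values mean more skew.
-- last_reindex shows when VACUUM REINDEX was last run on the table.
select distinct
    trim(p.name) as table_name,
    c.col,                -- zero-based column index within the table
    c.interleaved_skew,
    c.last_reindex
from svv_interleaved_columns c
join stv_tbl_perm p on c.tbl = p.id   -- stv_tbl_perm has one row per slice, hence DISTINCT
where trim(p.name) = 'accountbillingflat'
order by c.col;

If the skew values turn out to be well above 1.00, or last_reindex is empty or much older than the most recent loads, that could be one possible reason the scan restricts fewer blocks than expected, but this is an assumption to verify rather than a conclusion.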