Question

We have a large table in Redshift where we store AWS billing files and query them. We use an interleaved sort key because not every query includes every filter.

Running this query

select "column", type, encoding, distkey, sortkey, "notnull" 
from pg_table_def
where tablename = 'accountbillingflat' 
and sortkey <> 0
order by sortkey;

gives this result

Now, if we execute the following query

SELECT
    payeraccountid,
    linkedaccountid,
    billingmonth,
    updatedon
FROM
    accountbillingflat
WHERE
    linkedaccountid IN ('123')
    AND billingmonth IN (201603)
GROUP BY
    payeraccountid,
    linkedaccountid,
    billingmonth,
    updatedon
ORDER BY
    billingmonth desc,
    updatedon desc

the query execution plan is

XN Merge  (cost=1000002059236.65..1000002059237.77 rows=445 width=44)
  ->  XN Network  (cost=1000002059236.65..1000002059237.77 rows=445 width=44)
        ->  XN Sort  (cost=1000002059236.65..1000002059237.77 rows=445 width=44)
              ->  XN HashAggregate  (cost=2059217.08..2059217.08 rows=445 width=44)
                    ->  XN Seq Scan on accountbillingflat  (cost=0.00..2045174.64 rows=1404244 width=44)

The actual execution is

Looking at the details of the sequential scan, we see

This is strange, because both linkedaccountid and billingmonth are sort keys.
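
One possible next diagnostic, assuming the table really is defined with an interleaved sort key: the system view SVV_INTERLEAVED_COLUMNS can be joined to STV_TBL_PERM to show how skewed each interleaved sort column is and when it was last reindexed. This is only a sketch of a check, not a confirmed explanation of the plan above. Keep in mind that Redshift has no indexes, so EXPLAIN reports a Seq Scan even in cases where the sort key lets the scan skip blocks.

```sql
-- Sketch: inspect interleaved sort key skew for the table in question.
-- STV_TBL_PERM holds one row per slice, hence the DISTINCT.
select distinct
    ic.tbl              as tbl_id,
    trim(tp.name)       as table_name,
    ic.col              as sortkey_column_index,  -- zero-based column position
    ic.interleaved_skew,                          -- 1.00 means no skew
    ic.last_reindex
from svv_interleaved_columns ic
join stv_tbl_perm tp
    on ic.tbl = tp.id
where trim(tp.name) = 'accountbillingflat'
  and ic.interleaved_skew is not null
order by ic.col;

-- If the skew turns out to be high, a reindex may be worth trying
-- (assumption: it can be expensive on a large table):
-- vacuum reindex accountbillingflat;
```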

Table scan on a Redshift table even when sort keys are used

0 answers: