Optimizing a PostgreSQL query to get the total number of matching records and a limited number of rows, grouped by multiple fields

Time: 2013-09-30 13:48:58

Tags: sql postgresql amazon-redshift

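I have this query, in which I am trying to get the total number of records matching a given filter and, in the same query, the limited number of rows needed for pagination. Previously I did this in two steps: I first counted the records (see "Optimising Select number of rows in PostgreSql for multiple group by fields"), and then fetched the required rows in a separate query, based on the page size and the required offset.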

EXPLAIN select count (1) OVER () AS numrows , lower(column1) as column1, column2, column3, column4, column5, column6, column7, column8, sum(column9) as column9, sum(column10) as column10 
from tableName tablename 
where 
    column8 in (SOME 50-100 sets) 
    and column_date >= '2013-09-01' 
    and column_date < '2013-09-30' 
    group by lower(column1), column2, column3, column4, column5, column6, column7, column8 
    ORDER BY column9 desc 
    LIMIT 1000 OFFSET 0

XN Limit  (cost=1000134721702.61..1000134721702.63 rows=1000 width=67)
  ->  XN Merge  (cost=1000134721702.61..1000135118514.84 rows=158724893 width=67)
        Merge Key: sum(column9)
        ->  XN Network  (cost=1000134721702.61..1000135118514.84 rows=158724893 width=67)
              Send to leader
              ->  XN Sort  (cost=1000134721702.61..1000135118514.84 rows=158724893 width=67)
                    Sort Key: sum(column9)
                    ->  XN Window  (cost=107149638.61..113101822.10 rows=158724893 width=67)
                          ->  XN Network  (cost=107149638.61..108340075.31 rows=158724893 width=67)
                                Send to slice 0
                                ->  XN HashAggregate  (cost=107149638.61..108340075.31 rows=158724893 width=67)
                                      ->  XN Seq Scan on tableName tablename  (cost=0.00..67468415.44 rows=1587248927 width=67)
                                            Filter: ((column_date < '2013-09-30'::date) AND (column_date >= '2013-09-01'::date) AND (column8 = ANY ('{SOME 50-100 sets}'::integer[])))

Version details

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.666 

Thanks

1 answer:

Answer 0 (score: 3)

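Redshift is not optimized for returning paginated records. It is designed to maximize the performance of analytical queries that read many records but return only a small amount of output.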

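If you intend to paginate the results, I strongly recommend first sending the output of your SELECT to a temp table. You can then run a simple (fast!) COUNT(*) against that table and paginate over it, without forcing the query to be re-executed.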
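
A minimal sketch of that approach, using the column and table names from the question. The temp table name paged_results and the values in the IN list are assumptions made for illustration (the question elides the real list), and this has not been run against the asker's cluster:

```sql
-- Step 1: run the expensive aggregation once and keep the result.
-- "paged_results" is a hypothetical name; a temp table lives for the session.
CREATE TEMP TABLE paged_results AS
SELECT lower(column1) AS column1,
       column2, column3, column4, column5, column6, column7, column8,
       sum(column9)  AS column9,
       sum(column10) AS column10
FROM tableName
WHERE column8 IN (101, 102, 103)  -- hypothetical values standing in for the question's 50-100 IDs
  AND column_date >= '2013-09-01'
  AND column_date <  '2013-09-30'
GROUP BY lower(column1), column2, column3, column4, column5, column6, column7, column8;

-- Step 2: total number of grouped rows, counted on the small temp table.
SELECT count(*) AS numrows
FROM paged_results;

-- Step 3: fetch one page; only OFFSET changes for later pages.
SELECT *
FROM paged_results
ORDER BY column9 DESC
LIMIT 1000 OFFSET 0;
```

With this layout, the scan and aggregation of the large table happen only in step 1, while steps 2 and 3 read the already-aggregated rows. If several groups can share the same column9 total, adding further columns to the ORDER BY would likely be needed to keep page boundaries stable between requests.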