Explain plan issue with a group by clause in PostgreSQL 8.4

Date: 2012-09-21 06:54:24

Tags: database postgresql explain postgresql-8.4 sql-execution-plan

The details below describe a problem I am seeing in the explain plan for a query with a group by clause.

Table: web_categoryutfv1_24hr_ts_201209
Columns: "5mintime", category, hits, bytes, appid
Rows: 871
Index: "web_categoryutfv1_24hr_ts_201209_idx" btree ("5mintime")

I am running the following query:

select count(*) over () as t_totalcnt,
       max(hits) over () as t_maxhits,
       max(bytes) over () as t_maxbytes,
       *
from (
  select category,
         sum(hits) as hits,
         sum(bytes) as bytes
  from (
    select "5mintime",
           category,
           hits,
           bytes,
           appid,
           0 as tmpfield
    from web_categoryutfv1_24hr_ts_201209
    where "5mintime" >= '2012-09-12 00:00:00'
    and   "5mintime" < '2012-09-19 00:00:00'
  ) as tmp
  where "5mintime" >= '2012-09-12 00:00:00'
  and   "5mintime" <= '2012-09-18 23:59:59'
  and   appid in ('')
  group by category
  order by hits desc
) as foo limit 10

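According to the t_totalcnt value, the query returns 55 rows in total. I then ran ANALYZE on the web_categoryutfv1_24hr_ts_201209 table and ran the same query again with explain.

I got the following execution plan: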

-> Limit  (cost=31.31..31.61 rows=10 width=580)
->  WindowAgg  (cost=31.31..32.03 rows=24 width=580)
->  Subquery Scan foo  (cost=31.31..31.61 ***rows=24*** width=580)
      ->  Sort  (cost=31.31..31.37 rows=24 width=31)
            Sort Key: (sum(web_categoryutfv1_24hr_ts_201209.hits))
            ->  HashAggregate  (cost=30.39..30.75 rows=24 width=31)
                  ->  Seq Scan on web_categoryutfv1_24hr_ts_201209  (cost=0.00..27.60 rows=373 width=31)
                      Filter: (("5mintime" >= '2012-09-12 00:00:00'::timestamp without time zone) AND ("5mintime" < '2012-09-19 00:00:00'::timestamp without time zone) AND ("5mintime" >= '2012-09-12 00:00:00'::timestamp without time zone) AND ("5mintime" <= '2012-09-18 23:59:59'::timestamp without time zone) AND ((appid)::text = ''::text))

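The explain plan output shows HashAggregate (cost=30.39..30.75 rows=24 width=31), that is rows=24, whereas the query actually returns 55 rows. When I remove the group by clause from the query, I get 373 rows both in the explain plan output and when actually executing the query.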

So I would like to know: is there a problem with the explain plan when the query contains a group by clause?

1 answer:

Answer 0: (score: 1)

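The rows shown in the execution plan are estimates. As long as they are somewhere in the right range, you are fine. If they are completely off, it usually means your statistics are out of date.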
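As a rough sketch of how one might check the statistics behind that estimate: the planner derives its guess for the number of groups from the per-column statistics, which are visible in the pg_stats view. The choice of columns to inspect (category and appid) is an assumption based on the query in the question, not something the asker reported checking.

```sql
-- Refresh the planner statistics for the table from the question.
ANALYZE web_categoryutfv1_24hr_ts_201209;

-- Look at the estimated number of distinct values per column.
-- A positive n_distinct is an estimated count of distinct values;
-- a negative one is a fraction of the table's row count.
-- (The column list is an assumption for illustration.)
select attname,
       n_distinct
from pg_stats
where tablename = 'web_categoryutfv1_24hr_ts_201209'
and   attname in ('category', 'appid');
```

If n_distinct for category is in the same ballpark as the real number of categories, an estimate of 24 groups against 55 actual rows is within the kind of range described above.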

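Removing the group by changes the expected number of rows, because grouping reduces them: without it the plan reports the 373 rows coming out of the scan, with it the 24 groups those rows are expected to collapse into.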

So I don't see any problem here.

You can use explain analyze to compare the real numbers with the estimated numbers in the execution plan.
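A minimal sketch of that comparison, using a simplified form of the inner aggregate from the question (the redundant subquery and the duplicated date filter are dropped here for readability; that simplification is mine, not the asker's). Note that explain analyze actually executes the query.

```sql
-- EXPLAIN ANALYZE runs the statement and prints, for each plan node,
-- the estimate (cost=... rows=... width=...) followed by the measured
-- values (actual time=... rows=... loops=...).
EXPLAIN ANALYZE
select category,
       sum(hits) as hits,
       sum(bytes) as bytes
from web_categoryutfv1_24hr_ts_201209
where "5mintime" >= '2012-09-12 00:00:00'
and   "5mintime" <  '2012-09-19 00:00:00'
and   appid in ('')
group by category
order by hits desc;
```

On the HashAggregate line, the first rows= value is the estimate and the rows= value inside the "actual" parentheses is what the node really produced, so the two can be compared directly.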