How to combine multiple queries with different time ranges in a single table scan

Date: 2013-12-25 14:29:30

Tags: sql postgresql

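I have a large table (millions of rows) with statistics.

Table def: provider_id, spent, date

Most providers work on a monthly basis, so I can run a single query each month to get the amount spent in that month.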

select provider_id,sum(spent) from spent_table where date >= '20131201' group by 1;

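However, some providers do not work on a monthly basis, so I need their spend over a custom period. To get the spend for all of these custom providers I run a union query: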

select provider_id,sum(spent) from spent_table where date between '20130930' and '20140101' and provider_id = 272 group by 1  
union
select provider_id,sum(spent) from spent_table where date between '20130730' and '20131201' and provider_id = 273 group by 1 

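Each select does an index scan, but I have 50 custom providers, so the union query performs 50 index scans. Is there anything I can do to get this done in a single scan?

The plan is: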

HashAggregate  (cost=122297336.47..122297337.03 rows=56 width=12)
   ->  Append  (cost=0.00..122297336.19 rows=56 width=12)
         ->  GroupAggregate  (cost=0.00..2428542.88 rows=1 width=12)
               ->  Index Scan using date_idx on spent_table  (cost=0.00..2428448.33 rows=18908 width=12)
                     Index Cond: ((provider_id = 272) AND (date >= '2013-09-30 00:00:00'::timestamp without time zone) AND (date < '2014-01-01 00:00:00'::timestamp without time zone))
         ->  GroupAggregate  (cost=0.00..2428542.88 rows=1 width=12)
               ->  Index Scan using date_idx on spent_table  (cost=0.00..2428448.33 rows=18908 width=12)
                     Index Cond: ((provider_id = 262) AND (date >= '2013-09-30 00:00:00'::timestamp without time zone) AND (date < '2014-01-01 00:00:00'::timestamp without time zone))

Thanks

1 answer:

Answer 0: (score: 2)

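You always group by provider_id, and each query has a different provider_id in its WHERE clause. That means the results of the individual queries are guaranteed to be disjoint, so you can merge all the conditions into a single WHERE clause joined with OR operators: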

SELECT   provider_id, sum(spent)
FROM     spent_table 
WHERE    (date BETWEEN '20130930' AND '20140101' AND provider_id = 272) OR
         (date BETWEEN '20130730' AND '20131201' AND provider_id = 273)
GROUP BY provider_id
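
With 50 custom providers the OR list gets long to maintain by hand. A possible variation, sketched here under assumptions, keeps the (provider, start, end) triples in an inline VALUES list and joins against it. The name custom_periods and its columns period_start and period_end are hypothetical, not part of the original schema, and whether the planner reads spent_table in a single pass depends on your indexes and statistics, so it is worth checking with EXPLAIN.

```sql
-- Sketch only: custom_periods, period_start and period_end are made-up names.
-- Each row holds one provider and its custom date range (one row per provider).
WITH custom_periods (provider_id, period_start, period_end) AS (
    VALUES (272, DATE '2013-09-30', DATE '2014-01-01'),
           (273, DATE '2013-07-30', DATE '2013-12-01')
)
SELECT   s.provider_id, sum(s.spent)
FROM     spent_table s
JOIN     custom_periods p ON p.provider_id = s.provider_id
-- Same inclusive bounds as the BETWEEN conditions in the query above.
WHERE    s.date BETWEEN p.period_start AND p.period_end
GROUP BY s.provider_id;
```

Because each provider appears only once in the list, the join cannot duplicate rows, so the sums should match the OR version. If the periods live in a real table instead, the same join applies without the WITH clause.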