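I have a large table (several million rows) of statistics data.
Table def: provider_id, spent, date
Most providers work on a monthly cycle, so I can run a single query each month to get the amount spent for that month.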
select provider_id,sum(spent) from spent_table where date >= '20131201' group by 1;
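However, some providers do not work on a monthly cycle, so I need the amount spent over a custom period for each of them. To get the amount spent for all custom providers, I use a union query: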
select provider_id,sum(spent) from spent_table where date between '20130930' and '20140101' and provider_id = 272 group by 1
union
select provider_id,sum(spent) from spent_table where date between '20130730' and '20131201' and provider_id = 273 group by 1
Each select does an index scan, but I have 50 custom providers, so the union query performs 50 index scans. Is there anything I can do to get this in a single scan?
The plan is:
HashAggregate (cost=122297336.47..122297337.03 rows=56 width=12)
  -> Append (cost=0.00..122297336.19 rows=56 width=12)
        -> GroupAggregate (cost=0.00..2428542.88 rows=1 width=12)
              -> Index Scan using date_idx on spent_table (cost=0.00..2428448.33 rows=18908 width=12)
                    Index Cond: ((provider_id = 272) AND (date >= '2013-09-30 00:00:00'::timestamp without time zone) AND (date < '2014-01-01 00:00:00'::timestamp without time zone))
        -> GroupAggregate (cost=0.00..2428542.88 rows=1 width=12)
              -> Index Scan using date_idx on spent_table (cost=0.00..2428448.33 rows=18908 width=12)
                    Index Cond: ((provider_id = 262) AND (date >= '2013-09-30 00:00:00'::timestamp without time zone) AND (date < '2014-01-01 00:00:00'::timestamp without time zone))
Thanks
Answer 0 (score: 2)
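You are always grouping by provider_id, and each query has a different provider_id in its WHERE clause.
That means the result rows of the individual queries can never overlap, so you can simply merge all the conditions with ORs into a single WHERE clause: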
SELECT provider_id, sum(spent)
FROM spent_table
WHERE (date BETWEEN '20130930' AND '20140101' AND provider_id = 272) OR
(date BETWEEN '20130730' AND '20131201' AND provider_id = 273)
GROUP BY provider_id
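With 50 custom providers the OR list gets long, so one possible variation is to keep the per-provider periods in an inline VALUES list and join against it. This is only a sketch: the alias p and the column names date_from and date_to are made up for illustration, and the timestamp literals assume the date column is a timestamp, as the plan above suggests. Whether the planner turns this into a single scan depends on the data and the available indexes, so it is worth checking with EXPLAIN.

```sql
-- Hypothetical inline list: one row per custom provider
-- (provider_id, period start, period end). The names p, date_from
-- and date_to are illustrative and not part of the original schema.
SELECT s.provider_id, sum(s.spent)
FROM spent_table AS s
JOIN (VALUES
        (272, timestamp '2013-09-30', timestamp '2014-01-01'),
        (273, timestamp '2013-07-30', timestamp '2013-12-01')
     ) AS p (provider_id, date_from, date_to)
  ON s.provider_id = p.provider_id
 AND s.date BETWEEN p.date_from AND p.date_to
GROUP BY s.provider_id;
```

As long as each provider_id appears only once in the list, this should return the same rows as the OR version, and adding a provider means adding one line to the VALUES list.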