Hourly data warehouse report

Time: 2013-02-18 22:57:14

Tags: postgresql report postgresql-9.1 data-warehouse

I have the following tables in my PostgreSQL 9.1 database:

 SELECT * from hour_dimension limit 10;
 id |    date    | hour
----+------------+------
  1 | 2013-01-01 |    5
  2 | 2013-01-01 |    6
  3 | 2013-01-01 |    7
  4 | 2013-01-01 |    8
  5 | 2013-01-01 |    9
  6 | 2013-01-01 |   10
  7 | 2013-01-01 |   11
  8 | 2013-01-01 |   12
  9 | 2013-01-01 |   13
 10 | 2013-01-01 |   14



SELECT 

shop_id,
trans_date_time::date as date,
extract(hour from trans_date_time) as hour,
round(amount_in_cents/100.0,2) as amount

FROM transaction 
LIMIT 10;

shop_id |    date    | hour | amount
--------+------------+------+--------
 2877   | 2013-01-02 |    9 |   3.50
 2877   | 2013-01-02 |   10 |   4.00
 2877   | 2013-01-02 |   14 |   4.00
 2877   | 2013-01-03 |   11 |   1.40
 2877   | 2013-01-03 |   11 |   4.50
 2877   | 2013-01-03 |   12 |   3.00
 2877   | 2013-01-03 |   13 |   2.00
 2877   | 2013-01-03 |   13 |   2.00
 2877   | 2013-01-03 |   14 |   1.00
 2877   | 2013-01-04 |    9 |   4.00


 SELECT id  from shop limit 3;
 id
 ------
 2877
 2878
 2879

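I am trying to write a data-warehouse-style query so that I can generate (and store) a daily report describing how each shop performed in each hour, similar to the following: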

   date    | hour | shop_id  | amount
-----------+------+----------+--------
2013-01-01 |    5 |     2877 |   0.00
2013-01-01 |    6 |     2877 |   0.00
2013-01-01 |    7 |     2877 |   0.00
2013-01-01 |    8 |     2877 |   0.00
2013-01-01 |    9 |     2877 |   3.50
2013-01-01 |   10 |     2877 |   4.00
2013-01-01 |   11 |     2877 |   5.90
2013-01-01 |   12 |     2877 |   3.00
2013-01-01 |   13 |     2877 |   4.00
2013-01-01 |   14 |     2877 |   1.00

Sample query:

SELECT hd.date as date, hd.hour as hour, 

shop_id,

round(sum(case when amount is null then 0 else amount end),2) as amount 

FROM (

    SELECT 

    shop_id,
    trans_date_time::date as date,
    extract(hour from trans_date_time) as hour,
    amount_in_cents/100.0 as amount
    FROM
    transaction

) x

RIGHT JOIN hour_dimension hd ON (hd.date = x.date AND hd.hour = x.hour)

AND shop_id = 2877
where hd.date = '2013-01-10'

GROUP BY hd.date, hd.hour, shop_id
ORDER by hd.date, hd.hour
LIMIT 10;

3 answers:

Answer 0 (score: 1)

select 
    shop_id,
    trans_date_time::date as date,
    extract(hour from trans_date_time) as hour,
    round(sum(coalesce(amount_in_cents, 0))/100.0, 2) as amount
from transaction
group by 1, 2, 3
order by 1, 2, 3

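Answer 1 (score: 1)

You will probably get better performance if you can select the shop ID numbers from the shop table. I just used a SELECT DISTINCT subquery. The cross join gives you every combination of date, hour, and shop_id.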

with shop_hours as (
  -- every combination of date, hour and shop_id
  select hd."date", hd."hour", tr.shop_id
  from hour_dimension hd
  cross join (select distinct shop_id from transaction) tr
)
select sh."date", sh."hour", sh.shop_id,
       round(coalesce(sum(tr.amount_in_cents), 0) / 100.0, 2) as amount
from shop_hours sh
left join transaction tr
       on tr.trans_date_time::date = sh."date"
      -- transaction has no hour column; derive it from the timestamp
      and extract(hour from tr.trans_date_time) = sh."hour"
      and tr.shop_id = sh.shop_id
group by sh."date", sh."hour", sh.shop_id
order by sh.shop_id, sh."date", sh."hour"
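
The suggestion to read shop IDs from the shop table rather than from a SELECT DISTINCT subquery could look roughly like the sketch below. It assumes that shop.id corresponds to transaction.shop_id and that amount_in_cents is an integer or numeric column, as the question's own queries imply. Shops with no transactions at all would then also appear, with an amount of 0.

```sql
-- Sketch only: build the date/hour/shop grid from the shop table
-- (assumes shop.id matches transaction.shop_id).
WITH shop_hours AS (
  SELECT hd."date", hd."hour", s.id AS shop_id
  FROM hour_dimension hd
  CROSS JOIN shop s
)
SELECT sh."date", sh."hour", sh.shop_id,
       round(coalesce(sum(tr.amount_in_cents), 0) / 100.0, 2) AS amount
FROM shop_hours sh
LEFT JOIN transaction tr
       ON tr.trans_date_time::date = sh."date"
      AND extract(hour FROM tr.trans_date_time) = sh."hour"
      AND tr.shop_id = sh.shop_id
GROUP BY sh."date", sh."hour", sh.shop_id
ORDER BY sh.shop_id, sh."date", sh."hour";
```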

Answer 2 (score: 1)

Try the following query:

SELECT hd."date", hd.hour,
       s.shop_id,
       sum(coalesce(round(t.amount_in_cents/100.0,2),0)) AS amount
  FROM hour_dimension hd
  CROSS JOIN (SELECT DISTINCT shop_id FROM transaction) s
  LEFT JOIN transaction t
    ON hd."date"=t.trans_date_time::date
   AND hd.hour=extract(hour from t.trans_date_time)
   -- match each transaction to its own shop only
   AND s.shop_id=t.shop_id
 GROUP BY 1,2,3
 ORDER BY 1,2,3;

Also on SQL Fiddle.

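Note that using date as a column name or alias is not a good idea, because it is a reserved keyword in the SQL standard. If you do use it, always double-quote it, but it is better to avoid it as a column name altogether.

hour is not reserved in PostgreSQL, although the SQL standard reserves it.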
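
Since the question also mentions storing the report, one possible approach is an INSERT ... SELECT into a report table whose column names avoid date and hour, in line with the note above. This is only a sketch: the table hourly_shop_report and its columns report_date and report_hour are hypothetical names, the column types are assumptions, and shop.id is assumed to correspond to transaction.shop_id.

```sql
-- Hypothetical target table; names and types are assumptions.
CREATE TABLE hourly_shop_report (
    report_date date          NOT NULL,
    report_hour integer       NOT NULL,
    shop_id     integer       NOT NULL,
    amount      numeric(12,2) NOT NULL,
    PRIMARY KEY (report_date, report_hour, shop_id)
);

-- Store one day's figures: one row per hour and shop, 0 where nothing was sold.
INSERT INTO hourly_shop_report (report_date, report_hour, shop_id, amount)
SELECT hd."date", hd.hour, s.id,
       round(coalesce(sum(t.amount_in_cents), 0) / 100.0, 2)
FROM hour_dimension hd
CROSS JOIN shop s
LEFT JOIN transaction t
       ON t.trans_date_time::date = hd."date"
      AND extract(hour FROM t.trans_date_time) = hd.hour
      AND t.shop_id = s.id
WHERE hd."date" = DATE '2013-01-10'
GROUP BY hd."date", hd.hour, s.id;
```

The primary key makes an accidental second load of the same day fail instead of silently duplicating rows.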