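Grouping data hourly and inserting it into a summary table in Postgres

Asked: 2014-12-30 18:27:41

Tags: sql postgresql

I have a database with about 50 tables (for now). Because of the volume of data pouring into a few of them, I have been tasked with creating hourly summaries of the data and dumping them into another table. Running reports against the raw data takes a long time: the new database (2 weeks old) has already reached 200k records in one of the two tables I am pulling data from.

The query yields three possible results per customer: "cust1", "cust2" and "cust3". Each has one selected card (a mailed-in response to a product quality questionnaire, with a potential prize), which is one of 13 choices. The card is denoted by a letter ("A" for Ace, "K" for King, and so on for the other values).

Here is one of the subqueries and its result: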

select sp_cust_card_sequence(cards.cust1) as seq, cards.cust1 as card, count(cards.cust1) as card_count, rtt.game_id as game_id, gt.promo_id as promo_id, gt.choice_id as choice_id, extract(hour from header.start_timestamp) as hour, header.start_timestamp::timestamp::date as date
    from game_bac_cards cards
    inner join card_cust_resp header ON (header.id = cards.game_id)
    inner join game_table gt ON (header.promo_id = gt.promo_id)
    inner join ref_table_type rtt ON (gt.table_type_id = rtt.id)
    where result <> 'undef' 
    group by date, hour, card, rtt.game_id, gt.promo_id, gt.choice_id, cards.cust1

And the result:

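Basically, I want to group all the "K" values in the card column by hour. With the query below I seem to almost get there, but the snippet below shows that the value "K" for hour "19" on "December 4, 2014" has two entries instead of one. I am sure there is a more elegant way to do this.

Final query: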

select date, hour, card, seq, sum(card_count) as card_count, game_id, choice_id, promo_id 
from (
select date, hour, card, seq, sum(card_count) as card_count, game_id, choice_id, promo_id from (
    select sp_cust_card_sequence(cards.cust1) as seq, cards.cust1 as card, count(cards.cust1) as card_count, rtt.game_id as game_id, gt.promo_id as promo_id, gt.choice_id as choice_id, extract(hour from header.start_timestamp) as hour, header.start_timestamp::timestamp::date as date
    from game_bac_cards cards
    inner join card_cust_resp header ON (header.id = cards.game_id)
    inner join game_table gt ON (header.promo_id = gt.promo_id)
    inner join ref_table_type rtt ON (gt.table_type_id = rtt.id)
    where result <> 'undef' 
    group by date, hour, card, rtt.game_id, gt.promo_id, gt.choice_id, cards.cust1
) as cust1_table
where cust1_table.card is not null and cust1_table.card <> ''
group by date, hour, card_count, card, seq, game_id, choice_id, promo_id

union all

select date, hour, card, seq, sum(card_count) as card_count, game_id, choice_id, promo_id from (
    select sp_cust_card_sequence(cards.cust2) as seq, cards.cust2 as card, count(cards.cust2) as card_count, rtt.game_id as game_id, gt.promo_id as promo_id, gt.choice_id as choice_id, extract(hour from header.start_timestamp) as hour, header.start_timestamp::timestamp::date as date
    from game_bac_cards cards
    inner join card_cust_resp header ON (header.id = cards.game_id)
    inner join game_table gt ON (header.promo_id = gt.promo_id)
    inner join ref_table_type rtt ON (gt.table_type_id = rtt.id)
    where result <> 'undef' 
    group by date, hour, card, rtt.game_id, gt.promo_id, gt.choice_id, cards.cust2
) as cust2_table
where cust2_table.card is not null and cust2_table.card <> ''
group by date, hour, card_count, card, seq, game_id, choice_id, promo_id

union all

select date, hour, card, seq, sum(card_count) as card_count, game_id, choice_id, promo_id from (
    select sp_cust_card_sequence(cards.cust3) as seq, cards.cust3 as card, count(cards.cust3) as card_count, rtt.game_id as game_id, gt.promo_id as promo_id, gt.choice_id as choice_id, extract(hour from header.start_timestamp) as hour, header.start_timestamp::timestamp::date as date
    from game_bac_cards cards
    inner join card_cust_resp header ON (header.id = cards.game_id)
    inner join game_table gt ON (header.promo_id = gt.promo_id)
    inner join ref_table_type rtt ON (gt.table_type_id = rtt.id)
    where result <> 'undef' 
    group by date, hour, card, rtt.game_id, gt.promo_id, gt.choice_id, cards.cust3
) as cust3_table
where cust3_table.card is not null and cust3_table.card <> ''
group by date, hour, card_count, card, seq, game_id, choice_id, promo_id
) as card_details
where card_details.card is not null and card_details.card <> ''
group by date, hour, card, card_count, seq, game_id, choice_id, promo_id
order by date, hour, seq

The result shows only card K | hour 19 | date December 4, 2014. This should be a single row, not two. (?)

Any help is greatly appreciated!

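1 Answer:

Answer 0: (score: 1)

Try removing card_count from the group by in the outer query. Because card_count is also the value being summed, grouping by it keeps rows with different counts apart instead of adding them together. The line in question is:

group by date, hour, card_count, card, seq, game_id, choice_id, promo_id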
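
To see why that one column matters, here is a minimal sketch of the effect. It is not the asker's schema: it uses Python's built-in sqlite3 with an in-memory database rather than Postgres, and the table name card_details, its four columns, and the sample counts are assumptions made up for illustration. The output alias is named total so that it cannot be confused with the card_count column.

```python
import sqlite3

# Hypothetical stand-in for the rows produced by the three "union all" branches.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table card_details (date text, hour integer, card text, card_count integer)"
)
conn.executemany(
    "insert into card_details values (?, ?, ?, ?)",
    [
        ("2014-12-04", 19, "K", 2),  # made-up count, e.g. from one branch
        ("2014-12-04", 19, "K", 3),  # made-up count, e.g. from another branch
        ("2014-12-04", 19, "A", 1),
    ],
)

# Grouping by card_count as well: the two "K" rows have different counts,
# so they land in different groups and are never added together.
split_rows = conn.execute(
    """
    select date, hour, card, sum(card_count) as total
    from card_details
    group by date, hour, card, card_count
    order by card, total
    """
).fetchall()

# Without card_count in the group by: one row per date, hour and card.
merged_rows = conn.execute(
    """
    select date, hour, card, sum(card_count) as total
    from card_details
    group by date, hour, card
    order by card
    """
).fetchall()

print("grouped by card_count too:", split_rows)
print("card_count removed:       ", merged_rows)
```

In the first query "K" at hour 19 appears twice, once per distinct count; in the second it appears once with the counts added up. The same reasoning would apply to the inner group by clauses if they are meant to produce one row per card and hour.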