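How to create buckets and groups within those buckets using PostgreSQL

Time: 2018-07-03 17:31:37

Tags: postgresql

How do I find the distribution of credit cards by year opened and by number of completed transactions? The credit cards should be grouped into three buckets: fewer than 10 transactions, between 10 and 30 transactions, and more than 30 transactions.

The first approach I tried was the width_bucket function in PostgreSQL, but the documentation says it only creates equal-width buckets, which is not what I want here. So I turned to a CASE expression. However, I am not sure how to use a CASE expression together with GROUP BY.

This is the data I am working with: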

table 1 - credit_cards table
credit_card_id
year_opened


table 2 - transactions table
transaction_id
credit_card_id - matches credit_cards.credit_card_id
transaction_status ("complete" or "incomplete")

This is what I have so far:

SELECT 

CASE WHEN transaction_count < 10 THEN “Less than 10”
WHEN transaction_count >= 10 and transaction_count < 30 THEN “10 <= transaction count < 30”
ELSE transaction_count>=30 THEN “Greater than or equal to 30”
END as buckets

count(*) as ct.transaction_count
FROM credit_cards c
INNER JOIN transactions t
ON c.credit_card_id = t.credit_card_id
WHERE t.status = “completed”
GROUP BY v.year_opened

GROUP BY buckets
ORDER BY buckets

Expected output

credit card count | year opened | transaction count bucket
23421             | 2002        | Less than 10
etc

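3 Answers:

Answer 0: (score: 1)

You can set the bin sizes in width_bucket by passing a sorted array containing the lower bound of each bin.

In your case that would be array[10,30]: anything below 10 falls into bin 0, anything from 10 to 29 falls into bin 1, and anything 30 or above falls into bin 2.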

WITH a AS (select generate_series(5,35) cnt)
SELECT  cnt, width_bucket(cnt, array[10,30]) 
FROM a;
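Applied to the tables in the question, this could look like the sketch below. It assumes the column names given in the question (year_opened, and transaction_status with the value 'complete'). The CTE name per_card and the aliases txn_count and bucket are made up for illustration. The count is cast to int because count(*) returns bigint, and some PostgreSQL versions require the operand and the array elements to have the same type.

```sql
-- Sketch only: per_card, txn_count and bucket are illustrative names.
WITH per_card AS (
    SELECT c.credit_card_id,
           c.year_opened,
           count(*)::int AS txn_count   -- completed transactions per card
    FROM credit_cards c
    JOIN transactions t
      ON t.credit_card_id = c.credit_card_id
    WHERE t.transaction_status = 'complete'
    GROUP BY c.credit_card_id, c.year_opened
)
SELECT count(*) AS credit_card_count,
       year_opened,
       -- bin 0: fewer than 10, bin 1: 10 to 29, bin 2: 30 or more
       width_bucket(txn_count, array[10, 30]) AS bucket
FROM per_card
GROUP BY year_opened, bucket
ORDER BY year_opened, bucket;
```

The result holds a bin number rather than a label. If readable labels are needed, the bin number can be mapped to text in an outer query.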

Answer 1: (score: 0)

Not sure if this is what you are looking for, but the output would look like this:

year opened | Less than 10 | 10 <= transaction count < 30 | Greater than or equal to 30
2002        |  23421       |                              |

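The query that belongs to this answer is not shown here. One way to get output in this wide shape is conditional aggregation with the FILTER clause (available since PostgreSQL 9.4). This is a sketch, not the answerer's original query. It assumes the column names from the question. The CTE name per_card and the output column names are hypothetical.

```sql
-- Sketch only: per_card, txn_count and the output column names are illustrative.
WITH per_card AS (
    SELECT c.credit_card_id,
           c.year_opened,
           count(*) AS txn_count   -- completed transactions per card
    FROM credit_cards c
    JOIN transactions t
      ON t.credit_card_id = c.credit_card_id
    WHERE t.transaction_status = 'complete'
    GROUP BY c.credit_card_id, c.year_opened
)
SELECT year_opened,
       -- one count per bucket, each as its own column
       count(*) FILTER (WHERE txn_count < 10)                     AS less_than_10,
       count(*) FILTER (WHERE txn_count >= 10 AND txn_count < 30) AS from_10_to_29,
       count(*) FILTER (WHERE txn_count >= 30)                    AS at_least_30
FROM per_card
GROUP BY year_opened
ORDER BY year_opened;
```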

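Answer 2: (score: 0)

To work this out, you first need to count the transactions per credit card to find the right bucket, and then count the credit cards per bucket per year. There are a couple of ways to get the final result. One is to join all the data first and compute the first level of aggregation, then compute the final level of aggregation: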

with t1 as (
  select year_opened
     , c.credit_card_id
     , case when count(*) < 10 then 'Less than 10'
            when count(*) < 30 then 'Between [10 and 30)'
            else 'Greater than or equal to 30'
       end buckets
  from credit_cards c
  join transactions t
    on t.credit_card_id = c.credit_card_id
 where t.transaction_status = 'complete'
 group by year_opened
     , c.credit_card_id
)
select count(*) credit_card_count
     , year_opened
     , buckets
  from t1
 group by year_opened
     , buckets;

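However, it may perform better to compute the first level of aggregation on the transactions table before joining it to the credit_cards table: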

select count(*) credit_card_count
     , year_opened
     , buckets
  from credit_cards c
  join (select credit_card_id
             , case when count(*) < 10 then 'Less than 10'
                    when count(*) < 30 then 'Between [10 and 30)'
                    else 'Greater than or equal to 30'
               end buckets
          from transactions
         where transaction_status = 'complete'
         group by credit_card_id) t
    on t.credit_card_id = c.credit_card_id
 group by year_opened
     , buckets;

If you prefer to unwind the query above and use a common table expression, you can do that too (I find this easier to read and understand):

with bkt as (
  select credit_card_id
       , case when count(*) < 10 then 'Less than 10'
              when count(*) < 30 then 'Between [10 and 30)'
              else 'Greater than or equal to 30'
          end buckets
    from transactions
   where transaction_status = 'complete'
   group by credit_card_id
)
select count(*) credit_card_count
     , year_opened
     , buckets
  from credit_cards c
  join bkt t
    on t.credit_card_id = c.credit_card_id
 group by year_opened
     , buckets;
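One thing to watch: the queries above use an inner join, so a card with no matching transactions does not show up in any bucket. If such cards should be counted under 'Less than 10', one option is a left join with the status filter moved into the join condition. A sketch, assuming the column names from the question (per_card and txn_count are illustrative names):

```sql
-- Sketch only: per_card and txn_count are illustrative names.
WITH per_card AS (
    SELECT c.credit_card_id,
           c.year_opened,
           -- counts only matched rows, so unmatched cards get 0
           count(t.transaction_id) AS txn_count
    FROM credit_cards c
    LEFT JOIN transactions t
           ON t.credit_card_id = c.credit_card_id
          -- filter in ON rather than WHERE so unmatched cards are kept
          AND t.transaction_status = 'complete'
    GROUP BY c.credit_card_id, c.year_opened
)
SELECT count(*) AS credit_card_count,
       year_opened,
       CASE WHEN txn_count < 10 THEN 'Less than 10'
            WHEN txn_count < 30 THEN 'Between [10 and 30)'
            ELSE 'Greater than or equal to 30'
       END AS buckets
FROM per_card
GROUP BY year_opened, buckets
ORDER BY year_opened, buckets;
```

Putting the status condition in a WHERE clause instead would discard the rows for cards without completed transactions and turn the left join back into an inner join in effect.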