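How do I find the number of credit cards by year opened and by completed transactions, grouping the cards into three buckets: fewer than 10 transactions, 10 to 30 transactions, and more than 30 transactions?
The first approach I tried was the width_bucket function in PostgreSQL, but the documentation says it only creates equidistant buckets, which is not what I want here. So I turned to a CASE expression instead. However, I'm not sure how to use a CASE expression together with GROUP BY.
Here is the data I'm working with: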
table 1 - credit_cards table
credit_card_id
year_opened
table 2 - transactions table
transaction_id
credit_card_id - matches credit_cards.credit_card_id
transaction_status ("complete" or "incomplete")
Here is what I have so far:
SELECT
CASE WHEN transaction_count < 10 THEN “Less than 10”
WHEN transaction_count >= 10 and transaction_count < 30 THEN “10 <= transaction count < 30”
ELSE transaction_count>=30 THEN “Greater than or equal to 30”
END as buckets
count(*) as ct.transaction_count
FROM credit_cards c
INNER JOIN transactions t
ON c.credit_card_id = t.credit_card_id
WHERE t.status = “completed”
GROUP BY v.year_opened
GROUP BY buckets
ORDER BY buckets
Expected output:
credit card count | year opened | transaction count bucket
23421 | 2002 | Less than 10
etc
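Answer 0: (score: 1)
You can specify the bin sizes in width_bucket by passing a sorted array containing the lower bound of each bin.
In your case that would be array[10,30]: anything less than 10 falls into bin 0, anything between 10 and 29 falls into bin 1, and anything 30 or more falls into bin 2.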
WITH a AS (select generate_series(5,35) cnt)
SELECT cnt, width_bucket(cnt, array[10,30])
FROM a;
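To connect this to the question's tables, here is a minimal sketch that applies width_bucket to each card's completed-transaction count and then counts cards per year and bin. It assumes the column names listed in the question (transaction_status with the value 'complete'); the CTE name per_card and the output alias transaction_count_bucket are made up for illustration. The cast to int is there because count(*) returns bigint, and matching the operand type to the threshold array avoids a type-resolution error on some PostgreSQL versions.

```sql
-- Sketch: bucket each card's completed-transaction count with width_bucket,
-- then count cards per opening year and bucket.
WITH per_card AS (
    SELECT c.credit_card_id,
           c.year_opened,
           -- count(*) is bigint; cast so operand and thresholds share a type
           width_bucket(count(*)::int, array[10, 30]) AS bucket
    FROM credit_cards c
    JOIN transactions t
      ON t.credit_card_id = c.credit_card_id
    WHERE t.transaction_status = 'complete'
    GROUP BY c.credit_card_id, c.year_opened
)
SELECT count(*) AS credit_card_count,
       year_opened,
       -- translate the bin number back into a readable label
       CASE bucket
           WHEN 0 THEN 'Less than 10'
           WHEN 1 THEN '10 <= transaction count < 30'
           ELSE 'Greater than or equal to 30'
       END AS transaction_count_bucket
FROM per_card
GROUP BY year_opened, bucket
ORDER BY year_opened, bucket;
```

Grouping by the integer bin rather than the label keeps the rows in bucket order when sorted.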
Answer 1: (score: 0)
Not sure if this is what you are looking for. The output would look like this:
year opened | Less than 10 | 10 <= transaction count < 30 | Greater than or equal to 30
2002 | 23421 | |
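A pivoted layout with one row per year and one column per bucket can be produced with conditional aggregation. The following is only a sketch of one way to do it, not necessarily the query this answer had in mind. It assumes the column names from the question and PostgreSQL 9.4 or later for the FILTER clause; the CTE name per_card is hypothetical. Empty buckets come out as 0 rather than blank.

```sql
-- Sketch: one row per opening year, one column per transaction-count bucket.
WITH per_card AS (
    -- first level: completed transactions per card
    SELECT c.credit_card_id,
           c.year_opened,
           count(*) AS transaction_count
    FROM credit_cards c
    JOIN transactions t
      ON t.credit_card_id = c.credit_card_id
    WHERE t.transaction_status = 'complete'
    GROUP BY c.credit_card_id, c.year_opened
)
SELECT year_opened AS "year opened",
       count(*) FILTER (WHERE transaction_count < 10)
           AS "Less than 10",
       count(*) FILTER (WHERE transaction_count >= 10 AND transaction_count < 30)
           AS "10 <= transaction count < 30",
       count(*) FILTER (WHERE transaction_count >= 30)
           AS "Greater than or equal to 30"
FROM per_card
GROUP BY year_opened
ORDER BY year_opened;
```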
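Answer 2: (score: 0)
To work this out, you first need to count the transactions for each credit card to find the right bucket, and then count the credit cards per year and bucket. There are a couple of ways to get the final result. One is to join all the data first and compute the first level of aggregation, then compute the final level of aggregation on top of it: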
with t1 as (
select year_opened
, c.credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from credit_cards c
join transactions t
on t.credit_card_id = c.credit_card_id
where t.transaction_status = 'complete'
group by year_opened
, c.credit_card_id
)
select count(*) credit_card_count
, year_opened
, buckets
from t1
group by year_opened
, buckets;
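However, it may perform better to compute the first level of aggregation on the transactions table before joining it to the credit cards table: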
select count(*) credit_card_count
, year_opened
, buckets
from credit_cards c
join (select credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from transactions
where transaction_status = 'complete'
group by credit_card_id) t
on t.credit_card_id = c.credit_card_id
group by year_opened
, buckets;
If you would rather unroll the query above into a common table expression, you can do that too (I find this easier to read and understand):
with bkt as (
select credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from transactions
where transaction_status = 'complete'
group by credit_card_id
)
select count(*) credit_card_count
, year_opened
, buckets
from credit_cards c
join bkt t
on t.credit_card_id = c.credit_card_id
group by year_opened
, buckets;
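One thing to watch with the inner joins above: a card with no completed transactions never appears in the result, although it arguably belongs in the 'Less than 10' bucket. If those cards should be counted (an assumption, since the question does not say), a LEFT JOIN variant along these lines may help. It assumes transaction_id is never NULL in the transactions table, so count(t.transaction_id) yields 0 for cards without a match.

```sql
-- Sketch: include cards that have zero completed transactions.
WITH bkt AS (
    SELECT c.credit_card_id,
           c.year_opened,
           -- count(t.transaction_id) ignores the NULLs produced by the LEFT JOIN
           CASE WHEN count(t.transaction_id) < 10 THEN 'Less than 10'
                WHEN count(t.transaction_id) < 30 THEN 'Between [10 and 30)'
                ELSE 'Greater than or equal to 30'
           END AS buckets
    FROM credit_cards c
    LEFT JOIN transactions t
           ON t.credit_card_id = c.credit_card_id
          -- the status filter lives in ON so unmatched cards are kept
          AND t.transaction_status = 'complete'
    GROUP BY c.credit_card_id, c.year_opened
)
SELECT count(*) AS credit_card_count,
       year_opened,
       buckets
FROM bkt
GROUP BY year_opened, buckets
ORDER BY year_opened, buckets;
```

Putting the status condition in the ON clause rather than in WHERE matters here: a WHERE filter on t would discard the NULL-extended rows and turn the LEFT JOIN back into an inner join.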