Suppose I have a table like this:
CampaignId  Category  Strike
1           A         2
1           B         3
1           Others    5
2           A         4
2           B         2
3           C         1
3           C         4
4           A         1
4           B         1
4           C         1
4           D         1
4           Others    1
Then I calculate each Category's percentage of Strike within its CampaignId, as follows:
SELECT CampaignId, Category, Strike,
       -- each row's Strike as a percentage of its campaign's total
       Strike::FLOAT / SUM(Strike::FLOAT) OVER (PARTITION BY CampaignId) * 100 AS PercentageOfStrikesByCategoryByCampaignId
FROM myTable
This produces the following intermediate table:
CampaignId  Category  Strike  PercentageOfStrikesByCategoryByCampaignId
1           A         2       20.0
1           B         3       30.0
1           Others    5       50.0
2           A         4       66.6
2           B         2       33.3
3           C         1       20.0
3           C         4       80.0
4           A         1       20.0
4           B         1       20.0
4           C         1       20.0
4           D         1       20.0
4           Others    1       20.0
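The figures above can be reproduced without creating a table. The following is a minimal sketch, assuming PostgreSQL-style syntax: the inline VALUES list standing in for myTable is an illustrative assumption, and the expression divides each row's Strike by its campaign total, which is what the 20.0 / 80.0 split for CampaignId 3 implies.

```sql
-- Sample rows from the question, inlined so that no real table is needed
-- (assumption: PostgreSQL-style VALUES list inside a CTE).
WITH myTable (CampaignId, Category, Strike) AS (
    VALUES (1, 'A', 2), (1, 'B', 3), (1, 'Others', 5),
           (2, 'A', 4), (2, 'B', 2),
           (3, 'C', 1), (3, 'C', 4),
           (4, 'A', 1), (4, 'B', 1), (4, 'C', 1), (4, 'D', 1), (4, 'Others', 1)
)
SELECT CampaignId, Category, Strike,
       -- Row's Strike divided by the campaign total, as a percentage.
       Strike::FLOAT / SUM(Strike::FLOAT) OVER (PARTITION BY CampaignId) * 100
           AS PercentageOfStrikesByCategoryByCampaignId
FROM myTable
ORDER BY CampaignId, Category, Strike;
```

For CampaignId 1 the campaign total is 2 + 3 + 5 = 10, so the 'Others' row works out to 5 / 10 * 100 = 50.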
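Now I want to assign a final label, say FinalCategory, based on the PercentageOfStrikesByCategoryByCampaignId calculated above. The gist of the FinalCategory condition is: if one of the categories within a CampaignId is 'Others' AND its PercentageOfStrikesByCategoryByCampaignId >= 30.0, then every row in that CampaignId group is labeled 'Others'. Otherwise, Category is copied directly to FinalCategory. The resulting table should look like this: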
CampaignId  Category  Strike  PercentageOfStrikesByCategoryByCampaignId  FinalCategory
1           A         2       20.0                                       Others
1           B         3       30.0                                       Others
1           Others    5       50.0                                       Others
2           A         4       66.6                                       A
2           B         2       33.3                                       B
3           C         1       20.0                                       C
3           C         4       80.0                                       C
4           A         1       20.0                                       A
4           B         1       20.0                                       B
4           C         1       20.0                                       C
4           D         1       20.0                                       D
4           Others    1       20.0                                       Others
How can I achieve this with the simplest possible SQL query? Thanks in advance for your help!
Answer 0 (score: 1)
SELECT CampaignId, Category, Strike, PercentageOfStrikesByCategoryByCampaignId,
       CASE WHEN Others_count > 0
                 AND MAX(CASE WHEN Category = 'Others' THEN PercentageOfStrikesByCategoryByCampaignId END)
                         OVER (PARTITION BY CampaignId) >= 30
            THEN 'Others'
            ELSE Category
       END AS FinalCategory
FROM (
    SELECT CampaignId, Category, Strike,
           -- each row's Strike as a percentage of its campaign's total
           Strike::FLOAT / SUM(Strike::FLOAT) OVER (PARTITION BY CampaignId) * 100 AS PercentageOfStrikesByCategoryByCampaignId,
           -- number of 'Others' rows in the campaign
           SUM(CASE WHEN Category = 'Others' THEN 1 ELSE 0 END) OVER (PARTITION BY CampaignId) AS Others_count
    FROM myTable
) T
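This adds two things to the existing query. A sum window function counts the 'Others' rows per CampaignId (Others_count). A case expression around a max window function then checks whether the row with the Others category has a percentage >= 30; if so, 'Others' is assigned as the final category, otherwise the category is used as is.

Answer 1 (score: 1)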
Let's start with your query as a CTE or subquery:
WITH t as (
      SELECT CampaignId, Category, Strike,
             -- each row's Strike as a percentage of its campaign's total
             Strike::FLOAT / SUM(Strike::FLOAT) OVER (PARTITION BY CampaignId) * 100 AS PercentageOfStrikesByCategoryByCampaignId
      FROM myTable
     )
select t.*,
       (case when OthersFlag > 0 then 'Others' else category end) as FinalCategory
from (select t.*,
             -- number of 'Others' rows in the campaign at or above the 30% threshold
             sum(case when category = 'Others' and PercentageOfStrikesByCategoryByCampaignId >= 30.0 then 1 else 0 end) over
                 (partition by campaignid) as OthersFlag
      from t
     ) t;
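
The labeling rule can be checked end to end against the sample data from the question. The following is a minimal sketch, assuming PostgreSQL-style syntax; the inline VALUES list standing in for myTable and the CTE name pct are illustrative choices rather than part of either answer. It folds the rule into a single windowed MAX, which is NULL for campaigns that have no 'Others' row, so those rows fall through to the ELSE branch.

```sql
-- Sample rows from the question (assumption: inlined via a VALUES list).
WITH myTable (CampaignId, Category, Strike) AS (
    VALUES (1, 'A', 2), (1, 'B', 3), (1, 'Others', 5),
           (2, 'A', 4), (2, 'B', 2),
           (3, 'C', 1), (3, 'C', 4),
           (4, 'A', 1), (4, 'B', 1), (4, 'C', 1), (4, 'D', 1), (4, 'Others', 1)
),
pct AS (
    SELECT CampaignId, Category, Strike,
           -- Row's Strike as a percentage of its campaign's total.
           Strike::FLOAT / SUM(Strike::FLOAT) OVER (PARTITION BY CampaignId) * 100
               AS PercentageOfStrikesByCategoryByCampaignId
    FROM myTable
)
SELECT pct.*,
       CASE
           -- Largest 'Others' percentage in the campaign; NULL when the
           -- campaign has no 'Others' row, which makes the comparison
           -- not true and keeps the original Category.
           WHEN MAX(CASE WHEN Category = 'Others'
                         THEN PercentageOfStrikesByCategoryByCampaignId END)
                    OVER (PARTITION BY CampaignId) >= 30
           THEN 'Others'
           ELSE Category
       END AS FinalCategory
FROM pct
ORDER BY CampaignId, Category, Strike;
```

With the sample rows, only CampaignId 1 has an 'Others' row at or above 30 percent (50.0), so its three rows become 'Others'; CampaignId 4 has an 'Others' row at 20.0, so its categories are kept unchanged.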