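I have a question about a SQL query. It could be written in "plain" SQL, but I am fairly sure I need some kind of group concatenation (I cannot use MySQL), so the second option is the Oracle dialect, since an Oracle database will be available. Let's assume we have the following entity:
Table: Veterinary visits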
Visit_Id,
Animal_id,
Veterinarian_id,
Sickness_code
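Let's say there are 100 visits (100 values of visit_id), and each animal_id has about 20 visits.
I need to create a SELECT,
grouped by Animal_id, with 3 columns.
How can I do this? The first and second columns are easy, but what about the third? I know I need to use Oracle's LISTAGG, OVER PARTITION BY, COUNT and RANK. I tried to tie them together, but it did not work out the way I expected :( What should this query look like?
Answer 0 (score: 1)
I think the most natural way is to use two levels of aggregation, along with some window functions: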
select vas.animal,
sum(case when sickness_code = 5 then cnt else 0 end) as numflu,
listagg(case when seqnum <= 3 then sickness_code end, ',') within group (order by seqnum) as top3sicknesses
from (select animal, sickness_code, count(*) as cnt,
row_number() over (partition by animal order by count(*) desc) as seqnum
from visits
group by animal, sickness_code
) vas
group by vas.animal;
This relies on the fact that listagg() ignores NULL values.
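To see the NULL handling in isolation, here is a minimal sketch that does not depend on the visits table at all. It generates the numbers 1 to 5 from dual (the alias n and the column name first3 are made up for this illustration) and aggregates only those that pass the CASE condition:

```sql
-- Rows where the CASE yields NULL (n = 4 and n = 5) are skipped by LISTAGG,
-- so no empty entries or trailing commas appear in the result.
select listagg(case when n <= 3 then n end, ',')
         within group (order by n) as first3
from (select level as n
      from dual
      connect by level <= 5);
-- first3 is '1,2,3'
```

The same mechanism is what limits top3sicknesses to the rows with seqnum <= 3 in the query above.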
Answer 1 (score: 1)
Here is some sample data:
create table VET as
select
rownum+1 Visit_Id,
mod(rownum+1,5) Animal_id,
cast(NULL as number) Veterinarian_id,
trunc(10*dbms_random.value)+1 Sickness_code
from dual
connect by level <=100;
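Query
Basically, the subqueries do the following:
aggregate the counts and compute the flu count (over all records of the animal)
compute the RANK (if you really need exactly 3 records, use ROW_NUMBER instead; see the discussion below)
filter the top 3 by RANK
aggregate the result with LISTAGG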
with agg as (
select Animal_id, Sickness_code, count(*) cnt,
sum(case when SICKNESS_CODE = 5 then 1 else 0 end) over (partition by animal_id) as cnt_flu
from vet
group by Animal_id, Sickness_code
), agg2 as (
select ANIMAL_ID, SICKNESS_CODE, CNT, cnt_flu,
rank() OVER (PARTITION BY ANIMAL_ID ORDER BY cnt DESC) rnk
from agg
), agg3 as (
select ANIMAL_ID, SICKNESS_CODE, CNT, CNT_FLU, RNK
from agg2
where rnk <= 3
)
select
ANIMAL_ID, max(CNT_FLU) CNT_FLU,
LISTAGG(SICKNESS_CODE||'('||CNT||')', ', ') WITHIN GROUP (ORDER BY rnk) as cnt_lts
from agg3
group by ANIMAL_ID
order by 1;
This gives:
 ANIMAL_ID    CNT_FLU CNT_LTS
---------- ---------- ---------------------------------------------
         0          1 6(5), 1(4), 9(3)
         1          1 1(5), 3(4), 2(3), 8(3)
         2          0 1(5), 10(3), 4(3), 6(3), 7(3)
         3          1 5(4), 2(3), 4(3), 7(3)
         4          1 2(5), 10(4), 1(2), 3(2), 5(2), 7(2), 8(2)
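One caveat about CNT_FLU in this output: it is only ever 0 or 1, even for animal 3, whose list shows 5(4). The window SUM runs over the already grouped rows, and there is at most one grouped row per animal with Sickness_code = 5, so the column effectively says whether the animal had any flu visit rather than how many. If the number of flu visits is what is wanted, one possible variant (a sketch against the same VET sample table, not the original answer's query) sums the per-group count instead of the constant 1:

```sql
with agg as (
  select animal_id, sickness_code, count(*) as cnt
  from vet
  group by animal_id, sickness_code
)
select animal_id, sickness_code, cnt,
       -- cnt already holds the number of visits per (animal, code), so
       -- summing it for code 5 gives the animal's number of flu visits
       sum(case when sickness_code = 5 then cnt else 0 end)
         over (partition by animal_id) as cnt_flu
from agg
order by animal_id, cnt desc;
```

The rest of the pipeline (ranking, filtering and LISTAGG) can stay as it is, because max(CNT_FLU) in the final SELECT just picks up the per-animal value.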
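I am deliberately showing that the top 3 sickness codes (by number of visits) can contain ties, which you should handle.
Check the RANK function. Using ROW_NUMBER in this case is not deterministic.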
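If exactly three codes per animal are required, ROW_NUMBER can be made repeatable by adding a tie-breaker to the ORDER BY. The sketch below assumes that breaking ties by the lower Sickness_code is acceptable, which is an arbitrary choice made for illustration; the alias rn is likewise made up:

```sql
select animal_id, sickness_code, cnt, rn
from (
  select animal_id, sickness_code, cnt,
         -- sickness_code is unique within an animal after the GROUP BY,
         -- so adding it to the ORDER BY makes the numbering deterministic
         row_number() over (partition by animal_id
                            order by cnt desc, sickness_code) as rn
  from (select animal_id, sickness_code, count(*) as cnt
        from vet
        group by animal_id, sickness_code)
)
where rn <= 3
order by animal_id, rn;
```

With RANK, tied codes are all kept, so a list can contain more than three entries, as in the output above. With this ROW_NUMBER variant the list always has at most three entries, at the cost of dropping some tied codes.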