Aggregate function or GROUP BY clause in a complex nested query

Time: 2016-11-19 21:34:25

Tags: sql sql-server

In my case, I need to get a list of job/task percentages by priority:

Priority    Percentage
-----------------------
 1            25%
11            10%

The task_events table has these columns:

task_events_id  time    missing_info    job_id  task_index  machine_id  event_type  user    scheduling_class    priority    cpu_request memory_request  disk_space_request  different_machines_restriction

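A job_id and its tasks can appear in multiple rows, so I created a new column, task_events_id, as the PK. I use it in a nested select to get a single row for each job and task, and then use that result to get the priority of each job. The main idea is that there are 11 priorities, each priority has many jobs, and each job is assigned to exactly one priority. This is the query I came up with: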

Select 
    tes.[priority], (tes.total_priority * 100 / (select sum(tes.total_priority)from tes )) as [percentage]
From
    (select 
         [priority], count(*) as total_priority 
     from 
         task_events as t
     inner join
         (select 
              max(task_events_id) as maxid, 1 as total 
          from 
              task_events 
          group by 
              job_id, task_index) as te on t.task_events_id = te.maxid
     group by 
         [priority]) as tes 
group by 
    tes.[priority]

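This is the best I have come up with, but overall it is complicated. Any suggestions?

With this query I get this error:

Invalid object name 'tes'.

I also get an error when I add 'tes.total_priority' to the last group by.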

2 answers:

Answer 0: (score: 0)

If, for each priority, you want its share of the total, use a window function:

select [priority], count(*) as total_priority,
       count(*) * 1.0 / sum(count(*)) over () as ratio
from task_events t
group by [priority];
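
Note that the query above counts every row of task_events. If the intent is the one described in the question (count only the row with the highest task_events_id for each job_id and task_index), a CTE is one way to combine that filter with the window function. This is only a sketch under that assumption; the names latest, rn, and percentage are illustrative and do not come from the original post. Unlike the derived-table alias tes, which cannot be used as a table name in another FROM clause (hence "Invalid object name"), a CTE name can be referenced like a table.

```sql
-- Sketch: keep one row per (job_id, task_index), then compute each
-- priority's percentage. "latest" and "rn" are illustrative names.
with latest as (
    select
        [priority],
        row_number() over (
            partition by job_id, task_index
            order by task_events_id desc   -- rn = 1 is the highest task_events_id
        ) as rn
    from task_events
)
select
    [priority],
    count(*) as total_priority,
    -- sum(count(*)) over () is the grand total across all priorities
    count(*) * 100.0 / sum(count(*)) over () as [percentage]
from latest
where rn = 1
group by [priority];
```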

Answer 1: (score: 0)

Actually, your query only needs one more field in the last group by: add tes.total_priority. That did not work, though, so the code has been removed.

-- Code removed 

But you can consider this simpler query, which drops the inner join and uses distinct instead:

Select 
    tes.[priority],
    -- sum(...) over () is the grand total across all priorities
    tes.total_priority * 100.0 / sum(tes.total_priority) over () as [percentage]
From
    (select
         [priority], count(*) as total_priority 
     from 
         (select  
             distinct [priority], job_id, task_index
         from task_events) as t
     group by
         [priority]   -- required: one count per priority
    ) as tes

Or, more compactly, with the window function applied directly to the grouped counts:

select 
    [priority], count(*) as total_priority,
    count(*) * 100.0 / sum(count(*)) over () as ratio
from (select distinct [priority], job_id, task_index
         from task_events) as t
group by [priority];

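Finally, if you need to count by job_id, you can drop task_index from the subquery. You can then remove the subquery altogether by replacing count(*) with count(distinct job_id). The results may differ from those of the previous queries, though, depending on the data. Which one is correct depends on what you actually want to get out of this data; this is only a suggestion.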
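
A sketch of that last variant, assuming each job should be counted once per priority regardless of how many tasks or events it has; the aliases total_jobs and percentage are illustrative:

```sql
-- Sketch: no subquery; each job_id is counted once within its priority.
select
    [priority],
    count(distinct job_id) as total_jobs,
    -- the window sum adds up the per-priority job counts
    count(distinct job_id) * 100.0
        / sum(count(distinct job_id)) over () as [percentage]
from task_events
group by [priority];
```

If a job could ever appear under more than one priority, it would be counted once in each, so the denominator is the sum of the per-priority counts rather than the number of distinct jobs overall.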