Question

In my case, I need to get a list of job/task percentages by priority:

Priority    Percentage
-----------------------
 1            25%
11            10%

The task_events table is:

task_events_id  time    missing_info    job_id  task_index  machine_id  event_type  user    scheduling_class    priority    cpu_request memory_request  disk_space_request  different_machines_restriction

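A job_id and its tasks can span multiple rows, so I created a new column, task_events_id, as the PK. I use it in a nested select to get a single row for each job and task, then apply that result to get the priority of each job. This is the query I came up with. The main idea is that I have 11 priorities. Each priority has many jobs, and each job is assigned to one priority.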

Select 
    tes.[priority], (tes.total_priority * 100 / (select sum(tes.total_priority)from tes )) as [percentage]
From
    (select 
         [priority], count(*) as total_priority 
     from 
         task_events as t
     inner join
         (select 
              max(task_events_id) as maxid, 1 as total 
          from 
              task_events 
          group by 
              job_id, task_index) as te on t.task_events_id = te.maxid
     group by 
         [priority]) as tes 
group by 
    tes.[priority]

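This is the best I have come up with, but overall it is complicated. Any suggestions?

With this query I get this error:

Invalid object name 'tes'.

I still get an error after putting 'tes.total_priority' in the last group by.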

Answer 1

If, for each priority, you want its share of the total, use a window function:

select [priority], count(*) as total_priority,
       count(*) * 1.0 / sum(count(*)) over () as ratio
from task_events t
group by [priority];
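
As a rough check of what this query returns, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite stands in for SQL Server here (an assumption; it accepts the same bracket quoting and window syntax for this query), the table is cut down to four columns, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory stand-in for the real database; only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_events ("
    "task_events_id INTEGER PRIMARY KEY, job_id INTEGER, "
    "task_index INTEGER, priority INTEGER)"
)

# Hypothetical sample data: three events at priority 1, one at priority 11.
rows = [
    (1, 100, 0, 1),
    (2, 100, 1, 1),
    (3, 200, 0, 1),
    (4, 300, 0, 11),
]
conn.executemany("INSERT INTO task_events VALUES (?, ?, ?, ?)", rows)

# count(*) is computed per priority; sum(count(*)) over () adds those
# per-priority counts together, giving the total number of rows.
result = conn.execute(
    """
    select [priority], count(*) as total_priority,
           count(*) * 1.0 / sum(count(*)) over () as ratio
    from task_events t
    group by [priority]
    order by [priority]
    """
).fetchall()

for priority, total, ratio in result:
    print(priority, total, ratio)
```

Note that this counts every row of task_events, so a task that appears in several events is counted once per event. The ratio is a fraction between 0 and 1; multiply by 100.0 instead of 1.0 for a percentage.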

Answer 2

Actually, your query only needs one more field in the last group by: add tes.total_priority. That did not work, though, so the code has been removed.

-- Code removed

But you could consider this simpler query, which drops the inner join and uses distinct instead:

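Select 
    tes.[priority],
    -- divide each priority's task count by the total number of distinct tasks
    (tes.total_priority * 100.0 /
        (select count(*)
         from (select distinct [priority], job_id, task_index
               from task_events) as d)) as [percentage]
From
    (select
         [priority], count(*) as total_priority 
     from 
         -- one row per distinct (priority, job, task) combination
         (select  
             distinct [priority], job_id, task_index
         from task_events) as t
     group by 
         [priority]
    ) as tes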

Or, more simply, using a window function:

select 
    [priority], count(*) as total_priority,
    count(*) * 100.0 / sum(count(*)) over () as ratio
from (select distinct [priority], job_id, task_index
         from task_events) as t
group by [priority];
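
To illustrate what the distinct subquery changes, here is a small sketch that runs the query above on invented rows in which one task has three events. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption), with a simplified four-column table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_events ("
    "task_events_id INTEGER PRIMARY KEY, job_id INTEGER, "
    "task_index INTEGER, priority INTEGER)"
)

# Hypothetical sample data: task (job 100, index 0) has three events,
# task (job 200, index 0) has one.
rows = [
    (1, 100, 0, 1),
    (2, 100, 0, 1),
    (3, 100, 0, 1),
    (4, 200, 0, 11),
]
conn.executemany("INSERT INTO task_events VALUES (?, ?, ?, ?)", rows)

# The distinct subquery collapses repeated events, so each
# (priority, job_id, task_index) combination is counted once.
result = conn.execute(
    """
    select [priority], count(*) as total_priority,
           count(*) * 100.0 / sum(count(*)) over () as ratio
    from (select distinct [priority], job_id, task_index
          from task_events) as t
    group by [priority]
    order by [priority]
    """
).fetchall()

for priority, total, ratio in result:
    print(priority, total, ratio)
```

Without the distinct step, the same rows would give 75% and 25%. One caveat: if a task's priority changes between events, distinct keeps one row per priority for that task, whereas the max(task_events_id) approach in the question keeps only the latest event.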

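One last thing: if you need to count by job_id, you can drop task_index from the subquery. You can then remove the subquery entirely by replacing count(*) with count(distinct job_id). The results may differ from the previous queries, depending on the data. Which one is correct depends on what you really want to get out of this data; this is only a suggestion.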
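
The difference between counting tasks and counting jobs can be seen in a small sketch. It again assumes SQLite (through Python's sqlite3) as a stand-in for SQL Server, a simplified table, and invented rows; the job-level query is one possible reading of the suggestion above, not necessarily the exact query the answer had in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_events ("
    "task_events_id INTEGER PRIMARY KEY, job_id INTEGER, "
    "task_index INTEGER, priority INTEGER)"
)

# Hypothetical sample data: job 100 has three tasks at priority 1,
# job 200 has one task at priority 11.
rows = [
    (1, 100, 0, 1),
    (2, 100, 1, 1),
    (3, 100, 2, 1),
    (4, 200, 0, 11),
]
conn.executemany("INSERT INTO task_events VALUES (?, ?, ?, ?)", rows)

# Task level: one count per distinct (priority, job_id, task_index).
task_level = conn.execute(
    """
    select [priority], count(*) as total_priority,
           count(*) * 100.0 / sum(count(*)) over () as ratio
    from (select distinct [priority], job_id, task_index
          from task_events) as t
    group by [priority]
    order by [priority]
    """
).fetchall()

# Job level: no subquery, each job counted once per priority.
job_level = conn.execute(
    """
    select [priority], count(distinct job_id) as total_priority,
           count(distinct job_id) * 100.0
               / sum(count(distinct job_id)) over () as ratio
    from task_events
    group by [priority]
    order by [priority]
    """
).fetchall()

print("tasks:", task_level)
print("jobs: ", job_level)
```

With these rows, priority 1 holds three of four tasks but only one of two jobs, so the two queries give different percentages.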

Aggregate function or GROUP BY clause is complicated in a nested query
