Group by and exclude the minimum and maximum from the results

Time: 2014-07-25 00:23:18

Tags: sql database oracle select

I have data like the following:

Date        ID          Amount
10-Jun-14   978500302   163005350
17-Jun-14   978500302   159947117
24-Jun-14   978500302   159142342
1-Jul-14    978500302   159623201
8-Jul-14    978500302   143066033
14-Jul-14   978500302   145852027
15-Jul-14   978500302   148595751

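Is there a way in Oracle to get the average of this data that excludes the highest and lowest values? I can get the overall average with GROUP BY ID and then AVG(Amount). But how do I do this while excluding the min and max?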

4 Answers:

Answer 0: (score: 6)

The easiest way is to use analytic functions to get the minimum and maximum before aggregating:

select id, avg(amount)
from (select d.*,
             min(amount) over (partition by id) as mina,
             max(amount) over (partition by id) as maxa
      from data d
     ) d
where amount > mina and amount < maxa
group by id;
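
As a sanity check against the sample data in the question, the filter above can be mirrored outside the database. This is a minimal Python sketch, not Oracle code; the list `amounts` is the question's Amount column typed in by hand, and the variable names are illustrative only.

```python
# Amounts for ID 978500302, copied from the question's sample data.
amounts = [163005350, 159947117, 159142342, 159623201,
           143066033, 145852027, 148595751]

mina, maxa = min(amounts), max(amounts)

# Same condition as the WHERE clause: strictly between min and max.
kept = [a for a in amounts if mina < a < maxa]

trimmed_avg = sum(kept) / len(kept)
print(len(kept), trimmed_avg)  # 5 154632087.6
```

Because the comparison is strict, every row tied for the minimum or maximum is dropped, not just one of them.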

Answer 1: (score: 1)

select id, avg(amount)
  from tbl t
 where id not in (select id from tbl group by id having count(distinct amount) > 2)
    or (amount <> (select min(x.amount) from tbl x where x.id = t.id)
   and amount <> (select max(x.amount) from tbl x where x.id = t.id))
 group by id

The first line in the WHERE clause keeps IDs that have no more than 2 distinct amounts. They would otherwise be excluded from the result.

If you would rather have them excluded, you can get rid of that line.

You can also try the following:

select id, avg(amount)
  from (select id, amount
          from tbl
        minus
        select id, min(amount)
          from tbl
         group by id
        minus
        select id, max(amount)
          from tbl
         group by id)
 group by id
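
One behaviour worth knowing about the MINUS version: MINUS is a set operator and returns distinct rows, so repeated (id, amount) pairs collapse into one before the average is taken. The sketch below illustrates this with Python's sqlite3 module, using SQLite's EXCEPT as an assumed stand-in for Oracle's MINUS; the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (id integer, amount integer)")
# Hypothetical data: the amount 20 appears twice for id 1.
conn.executemany("insert into tbl values (1, ?)",
                 [(10,), (20,), (20,), (50,), (60,)])

# Same shape as the MINUS query, with EXCEPT standing in for MINUS.
set_based = conn.execute("""
    select id, avg(amount)
      from (select id, amount from tbl
            except
            select id, min(amount) from tbl group by id
            except
            select id, max(amount) from tbl group by id)
     group by id
""").fetchall()

# Only the distinct values 20 and 50 survive, so the average is 35.0.
# Dropping just the rows 10 and 60 would leave 20, 20, 50 (average 30.0).
print(set_based)  # [(1, 35.0)]
```

If duplicate amounts should each count toward the average, the first query in this answer or an analytic-function approach is the safer choice.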

Answer 2: (score: 1)

Try something like this:

SELECT ID, 
(SUM(Amount)-Min(Amount)-Max(Amount))/(COUNT(Amount)-2) AS AVG
FROM yourTbl
Group By ID

As @Clockwork-Muse pointed out, this only works when the number of rows is greater than 2.
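
To make the row-count caveat concrete, here is a small Python sketch of the same arithmetic. The function name `trimmed_avg` and the choice to return None for small groups are assumptions made for illustration, not part of the answer.

```python
def trimmed_avg(amounts):
    """(SUM - MIN - MAX) / (COUNT - 2), or None when there are too few rows."""
    n = len(amounts)
    if n <= 2:
        # COUNT - 2 would be 0 (division by zero) or negative here.
        return None
    return (sum(amounts) - min(amounts) - max(amounts)) / (n - 2)

print(trimmed_avg([10, 20, 30, 40]))  # 25.0
print(trimmed_avg([10, 40]))          # None
```

This formula subtracts exactly one minimum and one maximum, so when the extreme values are tied, the remaining copies still count toward the average.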

Answer 3: (score: 0)

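Based on a comment saying "regardless, one min and one max need to be removed", here is a variation on Gordon's approach that excludes only one row when the minimum or maximum amount is duplicated: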

select id, avg(amount)
from (
  select id, amount,
    row_number() over (partition by id order by amount) as min_rn,
    row_number() over (partition by id order by amount desc) as max_rn
  from t42
)
where min_rn > 1
and max_rn > 1
group by id;

If you used rank() or dense_rank() instead of row_number(), this would behave the same as Gordon's, since they allow duplicate results. Any of these only hits the real table once.

To see how the ranking works, with the minimum and maximum amounts duplicated:

select dt, id, amount,
  row_number() over (partition by id order by amount) as min_rn,
  row_number() over (partition by id order by amount desc) as max_rn,
  rank() over (partition by id order by amount) as min_rnk,
  rank() over (partition by id order by amount desc) as max_rnk
from t42;

DT                ID     AMOUNT     MIN_RN     MAX_RN    MIN_RNK    MAX_RNK
--------- ---------- ---------- ---------- ---------- ---------- ----------
23-JUL-14  978500302  143066033          1          8          1          8 
08-JUL-14  978500302  143066033          2          9          1          8 
14-JUL-14  978500302  145852027          3          7          3          7 
15-JUL-14  978500302  148595751          4          6          4          6 
24-JUN-14  978500302  159142342          5          5          5          5 
01-JUL-14  978500302  159623201          6          4          6          4 
17-JUN-14  978500302  159947117          7          3          7          3 
24-JUL-14  978500302  163005350          8          2          8          1 
10-JUN-14  978500302  163005350          9          1          8          1 

With the same data, using row_number() gives:

        ID AVG(AMOUNT)
---------- -----------
 978500302   154175974 

Using rank() gives:

        ID AVG(AMOUNT)
---------- -----------
 978500302   154632088 
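
The two figures can be reproduced by hand. The following Python sketch is only an arithmetic check, not Oracle code: it takes the nine amounts from the ranking listing and applies each filter in plain Python, with list slicing standing in for row_number() and a strict comparison standing in for rank().

```python
# The nine amounts from the listing above (min and max each appear twice).
amounts = [143066033, 143066033, 145852027, 148595751, 159142342,
           159623201, 159947117, 163005350, 163005350]

# row_number() style: drop exactly one lowest and one highest row.
by_row_number = sorted(amounts)[1:-1]

# rank() style: ties share rank 1, so every min and max row is dropped.
lo, hi = min(amounts), max(amounts)
by_rank = [a for a in amounts if lo < a < hi]

print(round(sum(by_row_number) / len(by_row_number)))  # 154175974
print(round(sum(by_rank) / len(by_rank)))              # 154632088
```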


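For IDs with only 1 or 2 amounts you will get no result at all, which may be what you want to happen and probably makes sense for consistency. If you do want to allow for that possibility, and include the min and max when those are the only values, you can also count how many raw rows there are for each ID and include the min and max if there are not enough: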

select id, avg(amount)
from (
  select id, amount,
    count(*) over (partition by id) as cnt,
    row_number() over (partition by id order by amount) as min_rn,
    row_number() over (partition by id order by amount desc) as max_rn
  from t42
)
where (cnt < 3 or min_rn > 1)
and (cnt < 3 or max_rn > 1)
group by id;
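
The effect of the `cnt < 3` guard on small groups can be sketched in Python. The function name `avg_with_fallback` is hypothetical, and the sketch assumes each group has at least one row, which is always the case for an ID that appears in the table.

```python
def avg_with_fallback(amounts):
    """Average after dropping one min and one max, unless fewer than 3 rows."""
    n = len(amounts)
    ordered = sorted(amounts)
    # Mirrors "cnt < 3 or rn > 1": with fewer than 3 rows keep everything,
    # otherwise drop one lowest and one highest row.
    kept = ordered if n < 3 else ordered[1:-1]
    return sum(kept) / len(kept)

print(avg_with_fallback([5]))        # 5.0
print(avg_with_fallback([4, 6]))     # 5.0
print(avg_with_fallback([1, 2, 9]))  # 2.0
```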