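How do I get the average of the "middle" values in a group?

Time: 2013-09-30 18:00:31

Tags: sql postgresql

I have a table containing values and group IDs (simplified example). I need the average of the middle 3 values for each group. So if a group has 1, 2, or 3 values, it is just the plain average. But with 4 values it would exclude the highest, with 5 values the highest and the lowest, and so on. I was thinking of some kind of window function, but I'm not sure whether that is possible.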

http://www.sqlfiddle.com/#!11/af5e0/1

For this data:

TEST_ID TEST_VALUE  GROUP_ID
1       5           1
2       10          1
3       15          1
4       25          2
5       35          2
6       5           2
7       15          2
8       25          3
9       45          3
10      55          3
11      15          3
12      5           3
13      25          3
14      45          4

I would like:

GROUP_ID    AVG
1           10
2           15
3           21.6
4           45
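
To pin the rule down, here is a minimal reference sketch in Python (not SQL). It assumes "middle 3" means: sort each group ascending, keep everything when there are 3 or fewer values, otherwise keep the three positions around the count divided by 2, rounded up. That reading is an assumption, chosen because it matches the 4-, 5-, and 6-value cases described above. The name `middle_three_avg` and the in-memory `rows` list are illustrative only.

```python
from collections import defaultdict
from math import ceil

# Sample rows from the question: (test_id, test_value, group_id)
rows = [
    (1, 5, 1), (2, 10, 1), (3, 15, 1),
    (4, 25, 2), (5, 35, 2), (6, 5, 2), (7, 15, 2),
    (8, 25, 3), (9, 45, 3), (10, 55, 3), (11, 15, 3), (12, 5, 3), (13, 25, 3),
    (14, 45, 4),
]


def middle_three_avg(values):
    """Average of the middle three values (all values if there are 3 or fewer)."""
    ordered = sorted(values)
    n = len(ordered)
    if n > 3:
        # Assumed rule: 1-based middle position is n / 2 rounded up.
        mid = ceil(n / 2)
        # Keep 1-based positions mid-1 .. mid+1 (0-based slice mid-2 .. mid).
        ordered = ordered[mid - 2 : mid + 1]
    return sum(ordered) / len(ordered)


# Collect the values of each group, then apply the rule per group.
groups = defaultdict(list)
for _test_id, test_value, group_id in rows:
    groups[group_id].append(test_value)

result = {g: middle_three_avg(v) for g, v in sorted(groups.items())}
for group_id, avg in result.items():
    print(group_id, round(avg, 2))
```

For group 3 this keeps 15, 25, and 25 out of 5, 15, 25, 25, 45, 55, which is where the value of roughly 21.6 comes from.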

5 answers:

Answer 0 (score: 6)

Another option, using analytic functions:

SELECT group_id,
       avg( test_value )
FROM (
  select t.*,
         row_number() over (partition by group_id order by test_value ) rn,
         count(*) over (partition by group_id  ) cnt
  from test t
) alias 
where 
   cnt <= 3
   or 
   -- cnt / 2.0 avoids integer division, so odd counts stay centered (5 rows -> rn 2..4)
   rn between ceil( cnt / 2.0 ) - 1 and ceil( cnt / 2.0 ) + 1
group by group_id
;

Demo -> http://www.sqlfiddle.com/#!11/af5e0/59

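Answer 1 (score: 2)

I'm not familiar with the Postgres syntax for window functions, but I was able to solve your problem in SQL Server using SQL Fiddle. Maybe you can easily port it to Postgres-compatible code. Hope it helps!

A quick primer on how I approached it.

  1. Order the test values within each group
  2. Get the number of items in each group
  3. Use that as a subquery and select only the middle 3 items (that is the where clause in the outer query)
  4. Get the average for each group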

    select  
      group_id,
      avg(test_value)
    from (
      select 
        t.group_id, 
        convert(decimal,t.test_value) as test_value, 
        row_number() over (
          partition by t.group_id
          order by t.test_value
        ) as ord,
        g.gc
      from
        test t
        inner join (
          select group_id, count(*) as gc
          from test
          group by group_id
        ) g
          on t.group_id = g.group_id
      ) a
    where
      ord >= case when gc <= 3 then 1 when gc % 2 = 1 then gc / 2 else (gc - 1) / 2 end
      and ord <= case when gc <= 3 then 3 when gc % 2 = 1 then (gc / 2) + 2 else ((gc - 1) / 2) + 2 end
    group by
      group_id
    

Answer 2 (score: 2)

with cte as (
    select
        *,
        row_number() over(partition by group_id order by test_value) as rn,
        count(*) over(partition by group_id) as cnt
    from test
)
select
    group_id, avg(test_value)
from cte
where
    cnt <= 3 or
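    -- cnt / 2.0 avoids integer division, so odd counts stay centered (5 rows -> rn 2..4)
    (rn >= ceil(cnt / 2.0) - 1 and rn <= ceil(cnt / 2.0) + 1)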
group by group_id
order by group_id

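sql fiddle demo

In the cte, we use window functions to count the number of elements in each group_id and to compute the row_number within each group_id. Then, if that count is greater than 3, we find the middle of the group by dividing the count by 2 and take that element together with the ones at +1 and -1. If the count is <= 3, we simply take all elements.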
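
How "count divided by 2" is rounded matters here. As a hedged illustration in Python (not part of the original answer), the sketch below lists which row numbers land in the window for each group size, once with truncating integer division, which is what `cnt / 2` does in PostgreSQL when `cnt` is an integer, and once rounding up. The helper names `window_truncated` and `window_rounded_up` are made up for this example.

```python
from math import ceil


def window_truncated(cnt):
    """Row numbers kept when the middle is cnt // 2 (truncating division)."""
    if cnt <= 3:
        return list(range(1, cnt + 1))
    mid = cnt // 2
    return [rn for rn in range(1, cnt + 1) if mid - 1 <= rn <= mid + 1]


def window_rounded_up(cnt):
    """Row numbers kept when the middle is cnt / 2 rounded up."""
    if cnt <= 3:
        return list(range(1, cnt + 1))
    mid = ceil(cnt / 2)
    return [rn for rn in range(1, cnt + 1) if mid - 1 <= rn <= mid + 1]


# Compare the two variants side by side for small group sizes.
for cnt in range(1, 8):
    print(cnt, window_truncated(cnt), window_rounded_up(cnt))
```

For even counts both variants agree. For odd counts above 3 the truncating version picks a window shifted one row toward the low end (rows 1 to 3 out of 5), whereas rounding up gives rows 2 to 4, which drops the lowest and the highest as the question asks.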

Answer 3 (score: 1)

This works:

SELECT A.group_id, avg(A.test_value) AS avg_mid3 FROM
  (SELECT group_id,
         test_value,
         row_number() OVER (PARTITION BY group_id ORDER BY test_value) AS position
      FROM test) A
JOIN
  (SELECT group_id,
         CASE
           WHEN count(*) < 4 THEN 1
           WHEN count(*) % 2 = 0 THEN (count(*)/2 - 1)
           ELSE (count(*) / 2)
         END AS position_start,
         CASE
           WHEN count(*) < 4 THEN count(*)
           WHEN count(*) % 2 = 0 THEN (count(*)/2 + 1)
           ELSE (count(*) / 2 + 2)
         END AS position_end
         FROM test GROUP BY group_id) B
  ON A.group_id=B.group_id 
  AND A.position >= B.position_start 
  AND A.position <= B.position_end
GROUP BY A.group_id

Fiddle link: http://www.sqlfiddle.com/#!11/af5e0/56
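
To see what the two CASE expressions in the query above select, here is a small Python transcription of them. It is an illustration only: `case_bounds` is a hypothetical helper, and `//` stands in for SQL integer division on the non-negative `count(*)`.

```python
def case_bounds(count):
    """Mirror position_start / position_end from the query above."""
    if count < 4:
        # Small groups: keep every row.
        return 1, count
    if count % 2 == 0:
        # Even counts: one more row is dropped from the high end.
        return count // 2 - 1, count // 2 + 1
    # Odd counts: the same number of rows is dropped from both ends.
    return count // 2, count // 2 + 2


for count in range(1, 10):
    start, end = case_bounds(count)
    print(f"count={count}: positions {start}..{end}")
```

For every count of 4 or more the window is exactly three positions wide, so the outer `avg` always sees three rows per group in that case.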

Answer 4 (score: 0)

If you need to calculate the average for each group, you can do it like this:

SELECT CASE WHEN NUMBER_FIRST_GROUP <> 0 
               THEN SUM_FIRST_GROUP / NUMBER_FIRST_GROUP 
               ELSE NULL
       END AS AVG_FIRST_GROUP,
       CASE WHEN NUMBER_SECOND_GROUP <> 0 
               THEN SUM_SECOND_GROUP / NUMBER_SECOND_GROUP 
               ELSE NULL
       END AS AVG_SECOND_GROUP,
       CASE WHEN NUMBER_THIRD_GROUP <> 0 
               THEN SUM_THIRD_GROUP / NUMBER_THIRD_GROUP 
               ELSE NULL
       END AS AVG_THIRD_GROUP,
       CASE WHEN NUMBER_FOURTH_GROUP <> 0 
               THEN SUM_FOURTH_GROUP / NUMBER_FOURTH_GROUP 
               ELSE NULL
       END AS AVG_FOURTH_GROUP
FROM (
      SELECT 
         SUM(CASE WHEN GROUP_ID = 1 THEN 1 ELSE 0 END) AS NUMBER_FIRST_GROUP,
         SUM(CASE WHEN GROUP_ID = 1 THEN TEST_VALUE ELSE 0 END) AS SUM_FIRST_GROUP,
         SUM(CASE WHEN GROUP_ID = 2 THEN 1 ELSE 0 END) AS NUMBER_SECOND_GROUP,
         SUM(CASE WHEN GROUP_ID = 2 THEN TEST_VALUE ELSE 0 END) AS SUM_SECOND_GROUP,
         SUM(CASE WHEN GROUP_ID = 3 THEN 1 ELSE 0 END) AS NUMBER_THIRD_GROUP,
         SUM(CASE WHEN GROUP_ID = 3 THEN TEST_VALUE ELSE 0 END) AS SUM_THIRD_GROUP,
         SUM(CASE WHEN GROUP_ID = 4 THEN 1 ELSE 0 END) AS NUMBER_FOURTH_GROUP,
         SUM(CASE WHEN GROUP_ID = 4 THEN TEST_VALUE ELSE 0 END) AS SUM_FOURTH_GROUP
     FROM TEST
     ) AS FOO