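Median in a SQL join query

Time: 2018-03-01 19:19:50

Tags: sql postgresql

I have written a query that joins two tables, and I would like to return the median of one of its columns.

The query looks like this: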

select table1.column1, count(distinct(table2.column2))
from table1
left join table2 on table1.column1 = table2.column4
where column3 = 1
group by table1.column1

The result looks like this (with more rows):

| column1 | column2 |
+---------+---------+
| 111     | 4       |
| 222     | 5       |
| 333     | 5       |
| 444     | 5       |

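I want to get the median of the values in the second result column.

Is there a way to do this without significantly modifying the query?

2 Answers:

Answer 0: (score: 0)

You can use percentile_disc():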

select percentile_disc(0.5) within group (order by cnt)
from (select table1.column1, count(distinct table2.column2) as cnt
      from table1 left join
           table2
           on table1.column1 = table2.column4
      where column3 = 1
      group by table1.column1
     ) t
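
As a quick check of what percentile_disc(0.5) returns, the sketch below runs it on the four counts from the sample output in the question, supplied inline through a VALUES list so no tables are needed. The alias sample(cnt) is made up for illustration, and the WITHIN GROUP syntax assumes PostgreSQL 9.4 or later.

```sql
-- Hypothetical inline data: the four counts shown in the question's sample output.
select percentile_disc(0.5) within group (order by cnt) as median_cnt
from (values (4), (5), (5), (5)) as sample(cnt);
-- median_cnt = 5
```

percentile_disc always returns a value that actually occurs in the input; with an even number of rows it picks the lower of the two middle values instead of averaging them.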

Answer 1: (score: -1)

Create the following function and aggregate to compute the median:

CREATE OR REPLACE FUNCTION _final_median(NUMERIC[])
   RETURNS NUMERIC AS
$$
   SELECT AVG(val)
   FROM (
     SELECT val
     FROM unnest($1) val
     ORDER BY 1
     LIMIT  2 - MOD(array_upper($1, 1), 2)
     OFFSET CEIL(array_upper($1, 1) / 2.0) - 1
   ) sub;
$$
LANGUAGE 'sql' IMMUTABLE;

CREATE AGGREGATE median(NUMERIC) (
  SFUNC=array_append,
  STYPE=NUMERIC[],
  FINALFUNC=_final_median,
  INITCOND='{}'
);

Usage example: SELECT median(num_value) AS median_value FROM t;

Applied to the query in your question:

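-- over () makes median a window function, so it can be selected next to t.* without a group by
select t.*, median(column2) over () as median_value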
from (
      select table1.column1, count(distinct(table2.column2)) as column2
      from table1 left join 
      table2 on table1.column1 = table2.column4
      where column3 = 1 
      group by table1.column1
     ) t
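
If creating a function and an aggregate is more than is needed, the built-in percentile_cont(0.5) may be a lighter alternative: like the median aggregate above, it averages the two middle values when the row count is even. A minimal sketch on made-up inline data (the alias demo(v) is hypothetical), assuming PostgreSQL 9.4 or later:

```sql
-- Hypothetical inline data with an even number of rows.
select percentile_cont(0.5) within group (order by v) as median_cont,  -- 2.5 (average of 2 and 3)
       percentile_disc(0.5) within group (order by v) as median_disc   -- 2 (lower middle value)
from (values (1), (2), (3), (4)) as demo(v);
```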

Reference: https://wiki.postgresql.org/wiki/Aggregate_Median

More examples: How do i get min, median and max from my query in postgresql