MySQL: Calculating the median of a derived value

Time: 2016-05-07 00:17:31

Tags: mysql sql median

As the title says, I am trying to calculate the median of a derived field:

SELECT
    TIMESTAMPDIFF(DAY, T1.time1, T2.time2) as diff
FROM table1 T1 JOIN table2 T2 ON ...
WHERE ...
GROUP BY ...

Calculating the average is as simple as:

SELECT
    AVG(F.diff) as average 
FROM (
    -- subquery above
) F;

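However, I have been struggling to find a way to calculate the median, because most solutions seem to involve joining the list to itself. The only way I can see to do that is to write out the subquery twice. The subquery is not fast, so unless someone can confirm that MySQL will optimize away the redundancy and execute the subquery only once, I would really like to avoid that solution.

1 Answer:

Answer 0: (score: 0)

There is a trick using group_concat(), but it may not work here because of the length of the intermediate string (the result is truncated at group_concat_max_len, 1024 bytes by default). A better approach is simply to enumerate the rows and then use conditional aggregation. However, this requires two levels of enumeration: one pass to number the rows within each group, and a second pass to carry each group's row count onto every row: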

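-- Note: this relies on MySQL evaluating user variables in ORDER BY order,
-- which the server does not guarantee.
SELECT <group by columns>,
       -- rn is 1-based, so 2*rn matches the middle row (odd cnt) or the two middle rows (even cnt)
       AVG(CASE WHEN 2 * rn IN (cnt, cnt + 1, cnt + 2) THEN diff END) as median
FROM (SELECT t.*,
             -- second pass: rows arrive in descending order, so the first row of each group has rn = group size
             (@max := if(@g1 = concat_ws(':', <group by columns>), @max,
                         if(@g1 := concat_ws(':', <group by columns>), rn, rn)
                        )
             ) as cnt
      FROM (SELECT . . .,
                   TIMESTAMPDIFF(DAY, T1.time1, T2.time2) as diff,
                   -- first pass: number the rows within each group, restarting at 1 when the group changes
                   (@rn := if(@g = concat_ws(':', <group by columns>), @rn + 1,
                              if(@g := concat_ws(':', <group by columns>), 1, 1)
                             )
                   ) as rn
            FROM table1 T1 JOIN table2 T2 ON ... CROSS JOIN
                 (SELECT @g := '', @rn := 0) params
            WHERE ...
            ORDER BY <group by columns>, <ordering column>  -- here the ordering column is diff
           ) t CROSS JOIN
           (SELECT @g1 := '', @max := -1) params
      ORDER BY <group by columns>, <ordering column desc>
     ) t
GROUP BY <group by columns>;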
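
To see why enumeration plus conditional aggregation yields a median, here is a minimal Python sketch of the same logic, run outside the database. It is only an illustration: the function name grouped_median and the sample rows are hypothetical, and it assumes a 1-based row number rn within each group and a per-group row count cnt, with the middle rows selected by 2 * rn IN (cnt, cnt + 1, cnt + 2).

```python
from collections import defaultdict
from statistics import median


def grouped_median(rows):
    """rows: iterable of (group_key, diff) pairs; returns {group_key: median}."""
    groups = defaultdict(list)
    for key, diff in rows:
        groups[key].append(diff)

    result = {}
    for key, diffs in groups.items():
        diffs.sort()       # plays the role of ORDER BY <ordering column>
        cnt = len(diffs)   # the per-group count carried onto every row
        # rn is 1-based; keep the middle row (odd cnt) or the two middle rows (even cnt)
        middle = [d for rn, d in enumerate(diffs, start=1)
                  if 2 * rn in (cnt, cnt + 1, cnt + 2)]
        result[key] = sum(middle) / len(middle)  # AVG(CASE WHEN ... END)
    return result


# Hypothetical sample data: group "a" has an odd count, group "b" an even count.
rows = [("a", 5), ("a", 1), ("a", 9),
        ("b", 4), ("b", 2), ("b", 8), ("b", 7)]
medians = grouped_median(rows)
print(medians)

# Cross-check against the standard library for several group sizes.
for n in range(1, 8):
    sample = [("g", (7 * i) % 13) for i in range(n)]
    assert grouped_median(sample)["g"] == median(d for _, d in sample)
```

For an odd count only 2 * rn = cnt + 1 can match, which selects the single middle row. For an even count, cnt and cnt + 2 match the two middle rows, and AVG takes their mean.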