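I'm having trouble calculating the median of a result set and could use some help. I need to report the median, maximum, minimum, average, and standard deviation. There are currently 222 rows, though that number can go up or down, and I'm not sure that what I have so far is an accurate way to calculate the median. Here is my query.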
Select
min(nodes) as min_nodes
,max(nodes) as max_nodes
,avg(nodes) as avg_nodes
,(max(nodes) + min(nodes))/2 as median_nodes
,stddev(nodes) as sd_nodes
from Table
Answer 0: (score: 2)
You can do this using window functions:
Select min(nodes) as min_nodes, max(nodes) as max_nodes, avg(nodes) as avg_nodes,
       -- picks the middle row for an odd count, the two middle rows for an even count
       avg(case when 2*seqnum in (cnt, cnt + 1, cnt + 2) then nodes end) as median_nodes,
       stddev(nodes) as sd_nodes
from (select t.*, row_number() over (order by nodes) as seqnum,
             count(*) over () as cnt
      from table t
     ) t
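The avg() is there to handle the case with an even number of values. In that case, the median is conventionally taken to be the midpoint of the two middle values.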
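A quick way to see which rows a position rule like this selects is to mimic it outside the database. The sketch below is only an illustration in Python, not part of the query: the function name `median_by_seqnum` and the sample lists are made up, and it assumes a non-empty input. It numbers the sorted values 1..cnt and averages those whose doubled position falls in (cnt, cnt + 1, cnt + 2), which is position 3 of 5 rows and positions 2 and 3 of 4 rows.

```python
from statistics import median


def median_by_seqnum(values):
    """Hypothetical helper: average the rows whose doubled 1-based position
    in sorted order is cnt, cnt + 1, or cnt + 2 (assumes values is non-empty)."""
    ordered = sorted(values)   # plays the role of ORDER BY nodes
    cnt = len(ordered)         # plays the role of count(*) over ()
    picked = [
        v
        for seqnum, v in enumerate(ordered, start=1)  # row_number()
        if 2 * seqnum in (cnt, cnt + 1, cnt + 2)
    ]
    return sum(picked) / len(picked)


odd_rows = [7, 1, 5, 3, 9]   # made-up sample data
even_rows = [7, 1, 5, 3]
print(median_by_seqnum(odd_rows), median(odd_rows))
print(median_by_seqnum(even_rows), median(even_rows))
```

In both cases the two printed numbers should agree. Note that in SQL, avg() over an integer column may use integer arithmetic in some databases, so casting nodes to a decimal type first may be needed there.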
Answer 1: (score: 1)
Here is one way to calculate the median:
select avg(nodes)
from (
select nodes
, row_number() over(order by nodes asc) as rn1
, row_number() over(order by nodes desc) as rn2
from table
) as x(nodes, rn1, rn2)
where rn1 in (rn2, rn2 - 1, rn2 + 1)
Numbering the rows in both directions is an optimization: the middle rows can be found without a separate count.
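The two-direction numbering can be sanity-checked the same way outside the database. This is a rough Python sketch, not the query itself: `median_by_two_rankings` is a made-up name, the input is assumed non-empty, and the descending number is assumed to be an exact mirror of the ascending one. In SQL that mirror is not guaranteed when nodes contains duplicate values, because row_number() may order ties differently in the two windows; adding a unique tiebreaker column to both order by clauses is one way to guard against that.

```python
from statistics import median


def median_by_two_rankings(values):
    """Hypothetical helper: keep rows whose ascending and descending
    row numbers differ by at most one, then average them."""
    ordered = sorted(values)
    cnt = len(ordered)
    picked = []
    for i, v in enumerate(ordered):
        rn1 = i + 1      # row_number() over (order by nodes asc)
        rn2 = cnt - i    # row_number() over (order by nodes desc), assumed exact mirror
        if rn1 in (rn2, rn2 - 1, rn2 + 1):
            picked.append(v)
    return sum(picked) / len(picked)


sample = [4, 8, 15, 16, 23, 42]   # made-up sample data
print(median_by_two_rankings(sample), median(sample))
```

With an odd number of rows only the single middle row satisfies rn1 = rn2; with an even number the two middle rows differ by one in each direction, so both are kept and averaged.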