Calculating the median of a column in a SQL common table expression

Time: 2010-07-09 22:22:20

Tags: sql sql-server sql-server-2008 median common-table-expression

In MSSQL 2008, I am trying to compute the median of a column of numbers in a common table expression using the classic median query, as follows:

WITH cte AS
(
   SELECT number
   FROM table
) 

SELECT cte.*,
(SELECT 
  (SELECT (   
    (SELECT TOP 1 cte.number  
     FROM     
     (SELECT TOP 50 PERCENT cte.number     
      FROM cte
      ORDER BY cte.number) AS medianSubquery1   
    ORDER BY cte.number DESC)  
    +   
  (SELECT TOP 1 cte.number
   FROM     
    (SELECT TOP 50 PERCENT cte.number    
     FROM cte   
     ORDER BY cte.number DESC) AS medianSubquery2   
   ORDER BY cte.number ASC) ) / 2)) AS median

FROM cte
ORDER BY cte.number

The result set I get looks like this:

NUMBER    MEDIAN
x1        x1
x1        x1
x1        x1
x2        x2
x3        x3

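In other words, the "median" column is identical to the "number" column, whereas I expected the median column to be "x1" all the way down. I use a similar expression to compute the mode, and it works fine over the same common table expression.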

3 Answers:

Answer 0: (score: 3)

Here is a slightly different way to do it:

WITH cte AS
(
   SELECT number
   FROM table1
)
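SELECT T1.number, T3.median
FROM cte T1
CROSS JOIN
(
    -- Average the middle row (odd count) or the two middle rows (even count).
    -- With integer division both predicates select the same row when the count is odd.
    SELECT AVG(number) AS median
    FROM
    (
        SELECT number, ROW_NUMBER() OVER(ORDER BY number) AS rn
        FROM cte
    ) T2
    WHERE T2.rn = ((SELECT COUNT(*) FROM cte) + 1) / 2
    OR T2.rn = ((SELECT COUNT(*) FROM cte) + 2) / 2
) T3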
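
One caveat, assuming number is an integer column: in SQL Server, AVG over integers returns an integer, so with an even row count the average of the two middle values is truncated. The sketch below uses hypothetical inline sample data (1, 2, 3, 4) instead of table1 and compares the integer average with one computed after casting to decimal; the names ranked, int_median and v are made up for illustration.

```sql
-- Hypothetical sample data: four integers, so the true median is 2.5
WITH cte AS
(
    SELECT number
    FROM (VALUES (1), (2), (3), (4)) AS v(number)
),
ranked AS
(
    SELECT number, ROW_NUMBER() OVER(ORDER BY number) AS rn
    FROM cte
)
SELECT AVG(number) AS int_median,                     -- integer average of the middle rows
       AVG(CAST(number AS decimal(10, 2))) AS median  -- decimal average of the middle rows
FROM ranked
WHERE rn = ((SELECT COUNT(*) FROM cte) + 1) / 2
OR rn = ((SELECT COUNT(*) FROM cte) + 2) / 2;
```

If the column is already decimal or float, the cast is unnecessary.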

Answer 1: (score: 1)

The problem with your query is that you are doing

SELECT TOP 1 cte.number FROM...

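but cte.number there does not refer to the subquery; it is correlated with the outer query, so the subquery is irrelevant. That explains why you simply get the outer row's own value back on every row. Removing the cte. prefix (as below) gives the median of the CTE, which is a constant value. What are you trying to do?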

WITH cte AS
    ( SELECT NUMBER
    FROM master.dbo.spt_values
    WHERE TYPE='p'
    )

SELECT cte.*,
(SELECT 
  (SELECT (   
    (SELECT TOP 1 number  
     FROM     
     (SELECT TOP 50 PERCENT cte.number     
      FROM cte
      ORDER BY cte.number) AS medianSubquery1   
    ORDER BY number DESC)  
    +   
  (SELECT TOP 1 number
   FROM     
    (SELECT TOP 50 PERCENT cte.number    
     FROM cte   
     ORDER BY cte.number DESC) AS medianSubquery2   
   ORDER BY number ASC) ) / 2)) AS median
FROM cte
ORDER BY cte.number

Returns

NUMBER      median
----------- -----------
0           1023
1           1023
2           1023
3           1023
4           1023
5           1023
6           1023
7           1023

Answer 2: (score: 0)

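This is not an entirely new answer, since it mostly expands on Mark Byer's answer, but there are a couple of options for simplifying the query further.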

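The first is to really make use of CTEs. Not only can you have multiple CTEs, they can also refer to one another. With that in mind, we can create an additional CTE that computes the median from the results of the first one. This encapsulates the median calculation and leaves the actual SELECT doing only what it needs to do. Note that ROW_NUMBER() had to be moved into the first CTE.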

;WITH cte AS
(
   SELECT number, ROW_NUMBER() OVER(ORDER BY number) AS rn
   FROM table1
),
med AS
(
    SELECT AVG(number) AS median
    FROM cte
    WHERE cte.rn = ((SELECT COUNT(*) FROM cte) + 1) / 2
    OR cte.rn = ((SELECT COUNT(*) FROM cte) + 2) / 2
)
SELECT cte.number, med.median
FROM cte
CROSS JOIN med
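
As a possible variation on the query above, the two COUNT(*) subqueries can be replaced by a windowed count computed in the first CTE. This is only a sketch under the same assumption that table1 has a numeric number column; the column name cnt is made up for illustration.

```sql
;WITH cte AS
(
   SELECT number,
          ROW_NUMBER() OVER(ORDER BY number) AS rn,
          COUNT(*) OVER() AS cnt   -- total row count, repeated on every row
   FROM table1
),
med AS
(
    -- Keep only the middle row (odd count) or the two middle rows (even count)
    SELECT AVG(number) AS median
    FROM cte
    WHERE cte.rn IN ((cte.cnt + 1) / 2, (cte.cnt + 2) / 2)
)
SELECT cte.number, med.median
FROM cte
CROSS JOIN med
```

Whether this performs better than the subquery form depends on the data and the plan, so it is worth comparing both.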

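To reduce the complexity even further, you "could" use a custom CLR aggregate to handle the median (such as the one provided in the free SQL# library at http://www.SQLsharp.com/ [I am its author]).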

;WITH cte AS
(
   SELECT number
   FROM table1
),
med AS
(
    SELECT  SQL#.Agg_Median(cte.number) AS median
    FROM    cte
)
SELECT cte.number, med.median
FROM cte
CROSS JOIN med
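
For readers on SQL Server 2012 or later (so not the 2008 version from the question), the built-in PERCENTILE_CONT analytic function can play the same role without a CLR library. A minimal sketch, again assuming table1 has a numeric number column:

```sql
;WITH cte AS
(
   SELECT number
   FROM table1
)
SELECT cte.number,
       -- 0.5 is the 50th percentile; the empty OVER () treats all rows as one partition
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY cte.number) OVER () AS median
FROM cte
```

PERCENTILE_CONT interpolates between the two middle values when the row count is even and returns a float, so no cast is needed for integer input.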