Calculating the median in SQL

Time: 2015-01-21 16:54:08

Tags: sql sql-server sql-server-2008

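I'm trying to implement a median based on this solution (among others, but this seems to be the simplest median code): Function to Calculate Median in Sql Server

However, I'm having trouble applying it. Below is my current SQL query. My goal is to find the median of TotalTimeOnCall for a given week, month, department, and CallerComplaintTypeID. I think my biggest problem is that I fundamentally don't understand how to apply this median function to get my result.

For example, if I wanted the average, I could change the ORDER BY to a GROUP BY and slap on an AVG(TotalTimeOnCall). How can I accomplish the same idea with this median solution?

Here is the "raw data" query: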

WITH rawData as (
SELECT 
    DepartmentName
    ,MONTH(PlacedOnLocal) AS MonthNumber
    ,CASE 
    WHEN Datepart(day, PlacedOnLocal) < 8 THEN '1' 
    WHEN Datepart(day, PlacedOnLocal) < 15 THEN '2' 
    WHEN Datepart(day, PlacedOnLocal) < 22 THEN '3' 
    WHEN Datepart(day, PlacedOnLocal) < 29 THEN '4' 
    ELSE '5' 
    END AS WeekNumber
    ,CallerComplaintTypeID
    ,TotalTimeOnCall
FROM [THE_RELEVANT_TABLE]
WHERE PlacedOnLocal BETWEEN '2014-09-01' AND '2014-12-31'
    AND CallerComplaintTypeID IN (5,89,9,31,203)
    AND TotalTimeOnCall IS NOT NULL
)
SELECT 
DepartmentName,
MonthNumber,
WeekNumber,
CallerComplaintTypeID,
TotalTimeOnCall
FROM
rawData
ORDER BY DepartmentName, MonthNumber, WeekNumber, CallerComplaintTypeID

With this sample output:

DepartmentName  MonthNumber  WeekNumber  CallerComplaintTypeID  TotalTimeOnCall
Dept_01         9            1           5                      654
Dept_01         9            1           5                      156
Dept_01         9            1           5                      21
Dept_01         9            1           5                      67
Dept_01         9            1           5                      13
Dept_01         9            1           5                      97
Dept_01         9            1           5                      87
Dept_01         9            1           5                      16

Here is the median solution from the link above:

SELECT
(
    (
        SELECT MAX(TotalTimeOnCall)
        FROM
            (
                SELECT TOP 50 PERCENT TotalTimeOnCall
                FROM rawData
                WHERE TotalTimeOnCall IS NOT NULL
                ORDER BY TotalTimeOnCall
            ) AS BottomHalf
    )
    +
    (
        SELECT MIN(TotalTimeOnCall)
        FROM
            (
                SELECT TOP 50 PERCENT TotalTimeOnCall
                FROM rawData
                WHERE TotalTimeOnCall IS NOT NULL
                ORDER BY TotalTimeOnCall DESC
            ) AS TopHalf
    )
) / 2 AS Median

1 Answer:

Answer 0 (score: 0)

Here is a simple median solution that gives you the median for each group.

-- Example of how to get median from a set of data
;with cte_my_query as (
    -- this cte simulates the query that would return your data
    -- union all keeps duplicate rows; plain union would drop them and skew the median
    select '2016-01-01' as dt, 1 as val
    union all
    select '2016-01-01' as dt, 10 as val
    union all
    select '2016-01-01' as dt, 7 as val
    union all
    select '2016-01-01' as dt, 16 as val
    union all
    select '2016-01-01' as dt, 11 as val
    union all
    select '2016-01-01' as dt, 2 as val
    union all
    select '2016-01-01' as dt, 5 as val
    union all
    select '2016-01-02' as dt, 6 as val
    union all
    select '2016-01-02' as dt, 13 as val
    union all
    select '2016-01-02' as dt, 7 as val
    union all
    select '2016-01-02' as dt, 9 as val
    union all
    select '2016-01-02' as dt, 18 as val
)
,cte_dates as (
    -- get the distinct key we want to get median for
    select distinct dt from cte_my_query
)
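select  dt, median.val
from    cte_dates
    cross apply (
        -- take the largest value of the lower half: this is the median for an
        -- odd row count, and the lower of the two middle values for an even count
        select top 1 val from (
            -- for each date, get the lowest 50% of the values
            -- (TOP ... PERCENT rounds a fractional row count up)
            select top 50 percent val
            from cte_my_query
            where cte_dates.dt = cte_my_query.dt
            order by val  -- must sort by the value, not by the grouping key
        ) as inner_median
        order by inner_median.val desc
    ) median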
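
To tie this back to the question's columns, here is a rough sketch of a per-group median that also averages the two middle values when a group has an even number of rows. It swaps the TOP 50 PERCENT trick for ROW_NUMBER() and COUNT(*) OVER, both of which should be available on SQL Server 2008. The table variable @rawData and the output column MedianTimeOnCall are made up for illustration; @rawData just stands in for the asker's rawData CTE and is loaded with the sample rows from the question.

```sql
-- Hypothetical stand-in for the question's rawData CTE (assumed column types).
DECLARE @rawData TABLE (
    DepartmentName        varchar(50),
    MonthNumber           int,
    WeekNumber            char(1),
    CallerComplaintTypeID int,
    TotalTimeOnCall       int
);

-- The eight sample rows shown in the question.
INSERT INTO @rawData VALUES
    ('Dept_01', 9, '1', 5, 654),
    ('Dept_01', 9, '1', 5, 156),
    ('Dept_01', 9, '1', 5, 21),
    ('Dept_01', 9, '1', 5, 67),
    ('Dept_01', 9, '1', 5, 13),
    ('Dept_01', 9, '1', 5, 97),
    ('Dept_01', 9, '1', 5, 87),
    ('Dept_01', 9, '1', 5, 16);

WITH ranked AS (
    SELECT
        DepartmentName, MonthNumber, WeekNumber, CallerComplaintTypeID, TotalTimeOnCall,
        -- position of each row inside its group, smallest call time first
        ROW_NUMBER() OVER (
            PARTITION BY DepartmentName, MonthNumber, WeekNumber, CallerComplaintTypeID
            ORDER BY TotalTimeOnCall
        ) AS rn,
        -- number of rows in the same group
        COUNT(*) OVER (
            PARTITION BY DepartmentName, MonthNumber, WeekNumber, CallerComplaintTypeID
        ) AS cnt
    FROM @rawData
)
SELECT
    DepartmentName, MonthNumber, WeekNumber, CallerComplaintTypeID,
    -- 1.0 * forces decimal arithmetic so the average is not truncated
    AVG(1.0 * TotalTimeOnCall) AS MedianTimeOnCall
FROM ranked
-- integer division: both expressions pick the single middle row when cnt is odd,
-- and the two middle rows when cnt is even
WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
GROUP BY DepartmentName, MonthNumber, WeekNumber, CallerComplaintTypeID
ORDER BY DepartmentName, MonthNumber, WeekNumber, CallerComplaintTypeID;
```

For the sample rows, the sorted call times are 13, 16, 21, 67, 87, 97, 156, 654, so the two middle values are 67 and 87 and the query should return a median of 77 for that group. In your real query you would keep your rawData CTE as it is, add ranked as a second CTE, and select FROM rawData instead of the table variable.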