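Function to calculate median in SQL Server

Time: 2009-08-27 18:24:33

Tags: sql sql-server aggregate-functions median

According to MSDN, Median is not available as an aggregate function in Transact-SQL. However, I would like to find out whether it is possible to create this functionality (using the CREATE AGGREGATE function, a user-defined function, or some other method).

What would be the best way (if possible) to do this, so that a median value can be calculated in an aggregate query (assuming a numeric data type)?

34 answers:

Answer 0: (score: 182)

If you're using SQL 2005 or later, this is a nice, simple median calculation for a single column in a table: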

SELECT
(
 (SELECT MAX(Score) FROM
   (SELECT TOP 50 PERCENT Score FROM Posts ORDER BY Score) AS BottomHalf)
 +
 (SELECT MIN(Score) FROM
   (SELECT TOP 50 PERCENT Score FROM Posts ORDER BY Score DESC) AS TopHalf)
) / 2 AS Median
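To see why the query above covers both odd and even row counts, here is a minimal Python sketch of the same logic. Python is used only to model the query, and the function name is hypothetical. It assumes that TOP 50 PERCENT rounds a fractional row count up to the next whole row.

```python
import math

def median_top_half(scores):
    """Model of (MAX of bottom 50 percent + MIN of top 50 percent) / 2."""
    ordered = sorted(scores)
    half = math.ceil(len(ordered) / 2)   # TOP 50 PERCENT rounds up
    bottom_half = ordered[:half]         # ORDER BY Score
    top_half = ordered[-half:]           # ORDER BY Score DESC
    return (max(bottom_half) + min(top_half)) / 2

print(median_top_half([1, 3, 5]))      # odd count: both halves share the middle row
print(median_top_half([1, 3, 5, 7]))   # even count: averages the two middle rows
```

One thing the sketch hides: in T-SQL the final / 2 is integer division when Score is an integer column, so dividing by 2.0 may be needed to keep a .5 result.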

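Answer 1: (score: 124)

There are lots of ways to do this, with dramatically varying performance. Here's one particularly well-optimized solution, from Medians, ROW_NUMBERs, and performance. It is especially good in terms of the actual I/O generated during execution: it looks more costly than other solutions, but it is actually much faster.

That page also contains a discussion of other solutions and details of the performance testing. Note the use of a unique column as a disambiguator in case there are multiple rows with the same value of the median column.

As with all database performance scenarios, always try to test a solution with real data on real hardware. You never know when a change to SQL Server's optimizer or a peculiarity in your environment will make a normally speedy solution slower.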

SELECT
   CustomerId,
   AVG(TotalDue)
FROM
(
   SELECT
      CustomerId,
      TotalDue,
      -- SalesOrderId in the ORDER BY is a disambiguator to break ties
      ROW_NUMBER() OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue ASC, SalesOrderId ASC) AS RowAsc,
      ROW_NUMBER() OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue DESC, SalesOrderId DESC) AS RowDesc
   FROM Sales.SalesOrderHeader SOH
) x
WHERE
   RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1)
GROUP BY CustomerId
ORDER BY CustomerId;
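The filter on RowAsc and RowDesc can be hard to picture, so here is a small Python model of it for a single partition. It is only a sketch: the function name is made up, and it assumes the tie-breaker column makes the descending numbering the exact reverse of the ascending one.

```python
def median_row_numbers(values):
    """Model of the RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1) filter."""
    ordered = sorted(values)
    n = len(ordered)
    picked = []
    for i, value in enumerate(ordered):
        row_asc = i + 1      # ROW_NUMBER() ... ORDER BY value ASC
        row_desc = n - i     # ROW_NUMBER() ... ORDER BY value DESC
        if row_asc in (row_desc, row_desc - 1, row_desc + 1):
            picked.append(value)
    return sum(picked) / len(picked)   # AVG over the one or two middle rows

print(median_row_numbers([10, 20, 30]))
print(median_row_numbers([10, 20, 30, 40]))
```

With an odd count only the middle row satisfies RowAsc = RowDesc; with an even count the two middle rows differ by exactly one, which is what the +1 and -1 cases catch.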

Answer 2: (score: 73)

In SQL Server 2012 you should use PERCENTILE_CONT:

SELECT SalesOrderID, OrderQty,
    PERCENTILE_CONT(0.5) 
        WITHIN GROUP (ORDER BY OrderQty)
        OVER (PARTITION BY SalesOrderID) AS MedianCont
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY SalesOrderID DESC

See also: http://blog.sqlauthority.com/2011/11/20/sql-server-introduction-to-percentile_cont-analytic-functions-introduced-in-sql-server-2012/
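As a rough illustration of what a continuous percentile does, the Python sketch below interpolates linearly between the two rows nearest the target position. It assumes the usual rule that the target row number is 1 + p * (n - 1); the function name is hypothetical and this is a model, not SQL Server's implementation.

```python
import math

def percentile_cont(values, p):
    """Linear interpolation between the two rows nearest the target position."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1)   # zero-based fractional row position
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

print(percentile_cont([1, 2, 3, 4], 0.5))   # falls between the two middle rows
```

For p = 0.5 this is the ordinary median: the middle row for an odd count, the mean of the two middle rows for an even count.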

Answer 3: (score: 21)

My original quick answer was:

select  max(my_column) as [my_column], quartile
from    (select my_column, ntile(4) over (order by my_column) as [quartile]
         from   my_table) i
--where quartile = 2
group by quartile

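This will give you the median and the interquartile range in one fell swoop. If you really only want the one row that is the median, uncomment the where clause.

When you put that into an explain plan, 60% of the work is sorting the data, which is unavoidable when calculating position-dependent statistics like this.

I've amended the answer to follow the excellent suggestion from Robert Ševčík-Robajz in the comments below: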

;with PartitionedData as
  (select my_column, ntile(10) over (order by my_column) as [percentile]
   from   my_table),
MinimaAndMaxima as
  (select  min(my_column) as [low], max(my_column) as [high], percentile
   from    PartitionedData
   group by percentile)
select
  case
    when b.percentile = 10 then cast(b.high as decimal(18,2))
    else cast((a.low + b.high)  as decimal(18,2)) / 2
  end as [value], --b.high, a.low,
  b.percentile
from    MinimaAndMaxima a
  join  MinimaAndMaxima b on (a.percentile -1 = b.percentile) or (a.percentile = 10 and b.percentile = 10)
--where b.percentile = 5

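This should calculate the correct median and percentile values when you have an even number of data items. Again, uncomment the final where clause if you only want the median and not the entire percentile distribution.

Answer 4: (score: 17)

Even better: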

SELECT @Median = AVG(1.0 * val)
FROM
(
    SELECT o.val, rn = ROW_NUMBER() OVER (ORDER BY o.val), c.c
    FROM dbo.EvenRows AS o
    CROSS JOIN (SELECT c = COUNT(*) FROM dbo.EvenRows) AS c
) AS x
WHERE rn IN ((c + 1)/2, (c + 2)/2);

From the master himself, Itzik Ben-Gan!
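The two expressions (c + 1)/2 and (c + 2)/2 rely on integer division. A short Python sketch (hypothetical function name, modelling the query rather than running it) shows which row numbers they select:

```python
def median_by_positions(values):
    ordered = sorted(values)
    c = len(ordered)
    # Integer division, as in T-SQL when both operands are integers.
    positions = {(c + 1) // 2, (c + 2) // 2}   # 1-based row numbers
    middle = [ordered[rn - 1] for rn in positions]
    return sum(middle) / len(middle)           # AVG(1.0 * val)

print(median_by_positions([7, 1, 5]))      # c = 3: both expressions give row 2
print(median_by_positions([7, 1, 5, 3]))   # c = 4: rows 2 and 3
```

For an odd count both expressions collapse to the same row, so the average is taken over a single value; for an even count they name the two middle rows.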

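Answer 5: (score: 7)

MS SQL Server 2012 (and later) has the PERCENTILE_DISC function, which computes a specific percentile for sorted values. PERCENTILE_DISC(0.5) will compute the median - https://msdn.microsoft.com/en-us/library/hh231327.aspx

Answer 6: (score: 4)

Simple, fast, accurate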

SELECT x.Amount 
FROM   (SELECT amount, 
               Count(1) OVER (partition BY 'A')        AS TotalRows, 
               Row_number() OVER (ORDER BY Amount ASC) AS AmountOrder 
        FROM   facttransaction ft) x 
WHERE  x.AmountOrder = Round(x.TotalRows / 2.0, 0)  
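It may help to check which single row the ROUND expression picks. The Python sketch below models it under the assumption that T-SQL ROUND rounds halves away from zero (Python's built-in round() does not, so the rounding is spelled out). The function names are hypothetical.

```python
import math

def picked_row(total_rows):
    """1-based AmountOrder selected by ROUND(TotalRows / 2.0, 0)."""
    return math.floor(total_rows / 2.0 + 0.5)   # half rounds up for positive values

def median_single_row(amounts):
    ordered = sorted(amounts)
    return ordered[picked_row(len(ordered)) - 1]

print(median_single_row([1, 2, 3, 4, 5]))   # odd count: the middle row
print(median_single_row([1, 2, 3, 4]))      # even count: the lower of the two middle rows
```

Under that assumption the query returns the true median for odd counts, and the lower middle value (not the average of the two middle values) for even counts.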

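Answer 7: (score: 4)

If you want to use the Create Aggregate function in SQL Server, this is how to do it. Doing it this way has the benefit of letting you write clean queries. Note that this process could be adapted to calculate a percentile value fairly easily.

Create a new Visual Studio project and set the target framework to .NET 3.5 (this is for SQL 2008; it may be different in SQL 2012). Then create a class file and put in the following code, or the C# equivalent: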

Imports Microsoft.SqlServer.Server
Imports System.Data.SqlTypes
Imports System.IO

<Serializable>
<SqlUserDefinedAggregate(Format.UserDefined, IsInvariantToNulls:=True, IsInvariantToDuplicates:=False, _
  IsInvariantToOrder:=True, MaxByteSize:=-1, IsNullIfEmpty:=True)>
Public Class Median
  Implements IBinarySerialize
  Private _items As List(Of Decimal)

  Public Sub Init()
    _items = New List(Of Decimal)()
  End Sub

  Public Sub Accumulate(value As SqlDecimal)
    If Not value.IsNull Then
      _items.Add(value.Value)
    End If
  End Sub

  Public Sub Merge(other As Median)
    If other._items IsNot Nothing Then
      _items.AddRange(other._items)
    End If
  End Sub

  Public Function Terminate() As SqlDecimal
    If _items.Count <> 0 Then
      Dim result As Decimal
      _items = _items.OrderBy(Function(i) i).ToList()
      If _items.Count Mod 2 = 0 Then
        result = ((_items((_items.Count / 2) - 1)) + (_items(_items.Count / 2))) / 2@
      Else
        result = _items((_items.Count - 1) / 2)
      End If

      Return New SqlDecimal(result)
    Else
      Return New SqlDecimal()
    End If
  End Function

  Public Sub Read(r As BinaryReader) Implements IBinarySerialize.Read
    'deserialize it from a string
    Dim list = r.ReadString()
    _items = New List(Of Decimal)

    For Each value In list.Split(","c)
      Dim number As Decimal
      If Decimal.TryParse(value, number) Then
        _items.Add(number)
      End If
    Next

  End Sub

  Public Sub Write(w As BinaryWriter) Implements IBinarySerialize.Write
    'serialize the list to a string
    Dim list = ""

    For Each item In _items
      If list <> "" Then
        list += ","
      End If      
      list += item.ToString()
    Next
    w.Write(list)
  End Sub
End Class

Then compile it, copy the DLL and PDB files to your SQL Server machine, and run the following commands in SQL Server:

CREATE ASSEMBLY CustomAggregate FROM '{path to your DLL}'
WITH PERMISSION_SET=SAFE;
GO

CREATE AGGREGATE Median(@value decimal(9, 3))
RETURNS decimal(9, 3) 
EXTERNAL NAME [CustomAggregate].[{namespace of your DLL}.Median];
GO

You can then write a query to calculate the median like this:     SELECT dbo.Median(Field) FROM Table
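For readers who do not know the CLR aggregate contract, here is a rough Python stand-in for the four methods the class implements. The class and method names are illustrative only; the sketch mirrors the logic of the VB code, not its serialization.

```python
class MedianAggregate:
    """Python stand-in for the Init / Accumulate / Merge / Terminate contract."""

    def __init__(self):            # Init: start with an empty list
        self.items = []

    def accumulate(self, value):   # called once per row; NULLs are skipped
        if value is not None:
            self.items.append(value)

    def merge(self, other):        # folds in another instance's partial list
        self.items.extend(other.items)

    def terminate(self):           # returns None (NULL) for an empty group
        if not self.items:
            return None
        ordered = sorted(self.items)
        mid = len(ordered) // 2
        if len(ordered) % 2 == 0:
            return (ordered[mid - 1] + ordered[mid]) / 2
        return ordered[mid]

a, b = MedianAggregate(), MedianAggregate()
for v in (5, None, 1):
    a.accumulate(v)
for v in (3, 7):
    b.accumulate(v)
a.merge(b)
print(a.terminate())
```

Because every value has to be kept until Terminate runs, memory use grows with the group size, which is worth bearing in mind for large groups.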

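Answer 8: (score: 3)

I came across this page while looking for a set-based solution for the median. After looking at some of the solutions here, I came up with the following. I hope it helps and works.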

DECLARE @test TABLE(
    i int identity(1,1),
    id int,
    score float
)

INSERT INTO @test (id,score) VALUES (1,10)
INSERT INTO @test (id,score) VALUES (1,11)
INSERT INTO @test (id,score) VALUES (1,15)
INSERT INTO @test (id,score) VALUES (1,19)
INSERT INTO @test (id,score) VALUES (1,20)

INSERT INTO @test (id,score) VALUES (2,20)
INSERT INTO @test (id,score) VALUES (2,21)
INSERT INTO @test (id,score) VALUES (2,25)
INSERT INTO @test (id,score) VALUES (2,29)
INSERT INTO @test (id,score) VALUES (2,30)

INSERT INTO @test (id,score) VALUES (3,20)
INSERT INTO @test (id,score) VALUES (3,21)
INSERT INTO @test (id,score) VALUES (3,25)
INSERT INTO @test (id,score) VALUES (3,29)

DECLARE @counts TABLE(
    id int,
    cnt int
)

INSERT INTO @counts (
    id,
    cnt
)
SELECT
    id,
    COUNT(*)
FROM
    @test
GROUP BY
    id

SELECT
    drv.id,
    drv.start,
    AVG(t.score)
FROM
    (
        SELECT
            MIN(t.i)-1 AS start,
            t.id
        FROM
            @test t
        GROUP BY
            t.id
    ) drv
    INNER JOIN @test t ON drv.id = t.id
    INNER JOIN @counts c ON t.id = c.id
WHERE
    t.i = ((c.cnt+1)/2)+drv.start
    OR (
        t.i = (((c.cnt+1)%2) * ((c.cnt+2)/2))+drv.start
        AND ((c.cnt+1)%2) * ((c.cnt+2)/2) <> 0
    )
GROUP BY
    drv.id,
    drv.start

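Answer 9: (score: 3)

Although Justin Grant's solution appears solid, I found that when there are a number of duplicate values within a given partition key, the ascending row numbers for the duplicate values end up out of sequence, so they do not align properly.

Here is a fragment from my result: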

KEY VALUE ROWA ROWD  

13  2     22   182
13  1     6    183
13  1     7    184
13  1     8    185
13  1     9    186
13  1     10   187
13  1     11   188
13  1     12   189
13  0     1    190
13  0     2    191
13  0     3    192
13  0     4    193
13  0     5    194

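I used Justin's code as the basis for this solution. Although it is not as efficient, given the use of multiple derived tables, it does resolve the row ordering problem I encountered. Any improvements would be welcome, as I am not that experienced in T-SQL.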

SELECT PKEY, cast(AVG(VALUE)as decimal(5,2)) as MEDIANVALUE
FROM
(
  SELECT PKEY,VALUE,ROWA,ROWD,
  'FLAG' = (CASE WHEN ROWA IN (ROWD,ROWD-1,ROWD+1) THEN 1 ELSE 0 END)
  FROM
  (
    SELECT
    PKEY,
    cast(VALUE as decimal(5,2)) as VALUE,
    ROWA,
    ROW_NUMBER() OVER (PARTITION BY PKEY ORDER BY ROWA DESC) as ROWD 

    FROM
    (
      SELECT
      PKEY, 
      VALUE,
      ROW_NUMBER() OVER (PARTITION BY PKEY ORDER BY VALUE ASC,PKEY ASC ) as ROWA 
      FROM [MTEST]
    )T1
  )T2
)T3
WHERE FLAG = '1'
GROUP BY PKEY
ORDER BY PKEY

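Answer 10: (score: 3)

The following query returns the median from a list of values in one column. It cannot be used as or along with an aggregate function, but you can still use it as a sub-query with a WHERE clause in the inner select.

SQL Server 2005+: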

SELECT TOP 1 value from
(
    SELECT TOP 50 PERCENT value 
    FROM table_name 
    ORDER BY  value
)for_median
ORDER BY value DESC

Answer 11: (score: 2)

In a UDF, write:

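 -- Reconstructed predicate: the original WHERE clause had no comparison operator.
 -- Returns the smallest value that at least half of the rows do not exceed (the lower median).
 Select Top 1 T.medianSortColumn from [Table] T
  Where (Select Count(*) from [Table] T2
         Where T2.medianSortColumn <= T.medianSortColumn) * 2
        >= (Select Count(*) From [Table])
  Order By T.medianSortColumn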

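Answer 12: (score: 2)

Justin's example above is very good. But the need for a primary key should be stated very clearly. I have seen that code in the wild without the key, and the results are bad.

My complaint about Percentile_Cont is that it won't give you an actual value from the dataset. To get a "median" that is an actual value from the dataset, use Percentile_Disc.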

SELECT SalesOrderID, OrderQty,
    PERCENTILE_DISC(0.5) 
        WITHIN GROUP (ORDER BY OrderQty)
        OVER (PARTITION BY SalesOrderID) AS MedianDisc
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY SalesOrderID DESC
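To make the difference from a continuous percentile concrete, here is a hedged Python sketch of a discrete percentile. It assumes the rule "return the smallest value whose cumulative distribution is at least p"; the function name is hypothetical.

```python
import math

def percentile_disc(values, p):
    """Smallest value whose cumulative distribution is at least p."""
    ordered = sorted(values)
    # The first 1-based row k with k / n >= p; ties do not change the value found.
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]

print(percentile_disc([1, 2, 3, 4], 0.5))   # an actual row value, the lower middle one
print(percentile_disc([1, 2, 3], 0.5))
```

So for an even number of rows the discrete version returns the lower of the two middle values, where the continuous version would return their average.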

Answer 13: (score: 1)

Using a single statement: one way is to use ROW_NUMBER() and filter with a subquery. Here it is used to find the median salary:

SELECT AVG(a.Salary) FROM                                                             
(SELECT ROW_NUMBER() OVER(ORDER BY Salary) as row_no, Salary FROM Employee)a
CROSS JOIN
(SELECT (COUNT(*)+1)*0.5 AS row_half FROM Employee )t
WHERE a.row_no IN (FLOOR(t.row_half),CEILING(t.row_half))

I have seen similar solutions on the net using FLOOR and CEILING, but I tried to do it in a single statement.

Answer 14: (score: 1)

For a continuous variable/measure 'col1' from 'table1':

select col1  
from
    (select top 50 percent col1, 
    ROW_NUMBER() OVER(ORDER BY col1 ASC) AS Rowa,
    ROW_NUMBER() OVER(ORDER BY col1 DESC) AS Rowd
    from table1 ) tmp
where tmp.Rowa = tmp.Rowd

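Answer 15: (score: 1)

See other solutions for median calculation in SQL here: "Simple way to calculate median with MySQL" (the solutions are mostly vendor-independent).

Answer 16: (score: 0)

Using the COUNT aggregate, you can first count how many rows there are and store the result in a variable named @cnt. Then you can compute the parameters for the OFFSET-FETCH filter, which specify, based on quantity ordering, how many rows to skip (the offset value) and how many rows to return (the fetch value).

The number of rows to skip is (@cnt - 1) / 2. It is clear that this calculation is correct for an odd count, because you first subtract 1 for the single middle value and then divide by 2.

It is also correct for an even count, because the division used in the expression is integer division. When you subtract 1 from an even count, you are left with an odd value.

When you divide that odd value by 2, the fractional part of the result (.5) is truncated. The number of rows to fetch is 2 - (@cnt % 2). The idea is that when the count is odd, the result of the modulo operation is 1, and you need to fetch 1 row. When the count is even, the result of the modulo operation is 0, and you need to fetch 2 rows. By subtracting the 1 or 0 result of the modulo operation from 2, you get the desired 1 or 2, respectively. Finally, to compute the median quantity, take the one or two resulting quantities and apply an average after converting the input integer value to a numeric one, as follows: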

DECLARE @cnt AS INT = (SELECT COUNT(*) FROM [Sales].[production].[stocks]);
SELECT AVG(1.0 * quantity) AS median
FROM ( SELECT quantity
FROM [Sales].[production].[stocks]
ORDER BY quantity
OFFSET (@cnt - 1) / 2 ROWS FETCH NEXT 2 - @cnt % 2 ROWS ONLY ) AS D;
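The offset and fetch arithmetic described above can be checked with a few lines of Python. This is only a model of the query (the function name is made up), with list slicing standing in for OFFSET ... FETCH:

```python
def median_offset_fetch(quantities):
    ordered = sorted(quantities)          # ORDER BY quantity
    cnt = len(ordered)
    offset = (cnt - 1) // 2               # rows to skip (integer division)
    fetch = 2 - cnt % 2                   # 1 row for odd counts, 2 for even counts
    window = ordered[offset:offset + fetch]
    return sum(window) / len(window)      # AVG(1.0 * quantity)

print(median_offset_fetch([1, 2, 3, 4, 5]))   # skips 2 rows, fetches 1
print(median_offset_fetch([1, 2, 3, 4]))      # skips 1 row, fetches 2
```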

Answer 17: (score: 0)

To get the median salary from the employees table:

with cte as (select salary, ROW_NUMBER() over (order by salary asc) as num from employees)

select avg(salary) from cte where num in ((select (count(*)+1)/2 from employees), (select (count(*)+2)/2 from employees));

Answer 18: (score: 0)

Here is my solution:

with tempa as

 (

    select value,row_number() over (order by value) as Rn,/* Assigning a 
                                                           row_number */
           count(value) over () as Cnt /*Taking total count of the values */
    from numbers
    where value is not null /* Excluding the null values */
 ),

tempb as

  (

    /* Since we don't know whether the number of rows is odd or even, we shall 
     consider both the scenarios */

    select round(cnt/2) as Ref from tempa where mod(cnt,2)=1
    union all
    select round(cnt/2) as Ref from tempa where mod(cnt,2)=0
     union all
    select round(cnt/2) + 1 as Ref from tempa where mod(cnt,2)=0
   )
  select avg(value) as Median_Value

  from tempa where rn in

    ( select Ref from tempb);

Answer 19: (score: 0)

Finding the median

This is the simplest method to find the median of an attribute.

Select round(S.salary,4) median from employee S where (select count(salary) from employee where salary < S.salary ) = (select count(salary) from employee where salary > S.salary)

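Answer 20: (score: 0)

This is as optimal a solution for finding medians as I can think of. The names in the example are based on Justin's example. Make sure the table Sales.SalesOrderHeader has an index on the columns CustomerId and TotalDue, in that order.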

SELECT
 sohCount.CustomerId,
 AVG(sohMid.TotalDue) as TotalDueMedian
FROM 
(SELECT 
  soh.CustomerId,
  COUNT(*) as NumberOfRows
FROM 
  Sales.SalesOrderHeader soh 
GROUP BY soh.CustomerId) As sohCount
CROSS APPLY 
    (Select 
       soh.TotalDue
    FROM 
    Sales.SalesOrderHeader soh 
    WHERE soh.CustomerId = sohCount.CustomerId 
    ORDER BY soh.TotalDue
    OFFSET sohCount.NumberOfRows / 2 - ((sohCount.NumberOfRows + 1) % 2) ROWS 
    FETCH NEXT 1 + ((sohCount.NumberOfRows + 1) % 2) ROWS ONLY
    ) As sohMid
GROUP BY sohCount.CustomerId
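The OFFSET and FETCH expressions in this query look different from the (count - 1) / 2 and 2 - count % 2 form used in another answer on this page, but a quick Python check suggests they describe the same window. The function names are hypothetical and the check only covers the arithmetic, not the query plan.

```python
def offset_fetch_a(n):
    """Offset and fetch as written in the query above."""
    return n // 2 - (n + 1) % 2, 1 + (n + 1) % 2

def offset_fetch_b(n):
    """The (n - 1) / 2 and 2 - n % 2 form of the same window."""
    return (n - 1) // 2, 2 - n % 2

# Both forms agree for every row count tried here.
assert all(offset_fetch_a(n) == offset_fetch_b(n) for n in range(1, 1000))
print(offset_fetch_a(4), offset_fetch_a(5))
```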

UPDATE

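I was a bit unsure about which method has the best performance, so I ran queries based on all three methods in one batch. The batch cost of each query was:

Without index:

  • Mine 30%
  • Justin Grant's 13%
  • Jeff Atwood's 58%

With index:

  • Mine 3%
  • Justin Grant's 10%
  • Jeff Atwood's 87%

To see how well the queries scale when an index exists, I created more data from the roughly 14,000 rows, multiplying by factors from 2 up to 512, which in the end gave about 7.2 million rows. Note that I made sure the CustomerId field was unique each time I made a copy, so the ratio of rows to unique CustomerId values stayed constant. I rebuilt the index after each step and then ran the executions, and with the data I had, the results stabilized at a factor of around 128 at these values:

  • Mine 3%
  • Justin Grant's 5%
  • Jeff Atwood's 92%

I wondered how performance would be affected by scaling the number of rows while keeping the number of unique CustomerId values constant, so I set up a new test that did just that. Now, instead of stabilizing, the batch cost ratios kept diverging. Also, instead of about 20 rows per CustomerId on average, I ended up with about 10,000 rows per unique ID. The numbers were:

  • Mine 4%
  • Justin's 60%
  • Jeff's 35%

I made sure each method was implemented correctly by comparing the results. My conclusion is that the method I used is generally faster as long as the index exists. Also note that this method is the one recommended for this particular problem in this article: https://www.microsoftpressstore.com/articles/article.aspx?p=2314819&seqNum=5

A way to further improve the performance of subsequent calls to this query is to persist the count information in an auxiliary table. You could even maintain it with a trigger that updates and stores the number of SalesOrderHeader rows per CustomerId. Of course, you could then simply store the median as well.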

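Answer 21: (score: 0)

For your question, Jeff Atwood has already given a simple and effective solution. But if you are looking for an alternative approach to calculating the median, the SQL code below will help you.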

create table employees(salary int);

insert into employees values(8);
insert into employees values(23);
insert into employees values(45);
insert into employees values(123);
insert into employees values(93);
insert into employees values(2342);
insert into employees values(2238);

select * from employees;

declare @odd_even int;
declare @cnt int;
declare @middle_no int;

set @cnt = (select count(*) from employees);
set @middle_no = (@cnt / 2) + 1;
select @odd_even = case when (@cnt % 2 = 0) then -1 else 0 end;

select AVG(tbl.salary)
from (select salary, ROW_NUMBER() over (order by salary) as rno
      from employees
      group by salary) tbl
where tbl.rno = @middle_no or tbl.rno = @middle_no + @odd_even;

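If you want to calculate the median in MySQL, this github link will be useful.

Answer 22: (score: 0)

Frequently, we need to calculate the median not just for the whole table but for aggregates with respect to some ID. In other words, we calculate the median for each ID in our table, where each ID has many records. (Based on the solution edited by @gdoron: good performance, and it works in many SQL dialects.)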

SELECT our_id, AVG(1.0 * our_val) as Median
FROM
( SELECT our_id, our_val, 
  COUNT(*) OVER (PARTITION BY our_id) AS cnt,
  ROW_NUMBER() OVER (PARTITION BY our_id ORDER BY our_val) AS rnk
  FROM our_table
) AS x
WHERE rnk IN ((cnt + 1)/2, (cnt + 2)/2) GROUP BY our_id;

Hope it helps.

Answer 23: (score: 0)

Building on Jeff Atwood's answer above, here it is with GROUP BY and a correlated subquery to get the median for each group.

for (int i = 0; i < sortedCells.Count; i++)
{
  command.Parameters.Clear();
  //your code to add parameters
}

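Answer 24: (score: 0)

I tried several alternatives, but because my data records have repeated values, the ROW_NUMBER versions did not seem to be a good choice for me. So here is the query I used (a version with NTILE):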

SELECT distinct
   CustomerId,
   (
       MAX(CASE WHEN Percent50_Asc=1 THEN TotalDue END) OVER (PARTITION BY CustomerId)  +
       MIN(CASE WHEN Percent50_desc=1 THEN TotalDue END) OVER (PARTITION BY CustomerId) 
   )/2 MEDIAN
FROM
(
   SELECT
      CustomerId,
      TotalDue,
     NTILE(2) OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue ASC) AS Percent50_Asc,
     NTILE(2) OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue DESC) AS Percent50_desc
   FROM Sales.SalesOrderHeader SOH
) x
ORDER BY CustomerId;
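The NTILE(2) version can be modelled in Python for one customer as follows. The sketch assumes that when the row count is odd, NTILE gives the extra row to the first bucket; the function names are hypothetical.

```python
def ntile2(n_rows):
    """Bucket number per row for NTILE(2); the first bucket takes the extra row."""
    first = (n_rows + 1) // 2
    return [1] * first + [2] * (n_rows - first)

def median_ntile(values):
    asc = sorted(values)
    desc = asc[::-1]
    low_half = [v for v, b in zip(asc, ntile2(len(asc))) if b == 1]     # Percent50_Asc = 1
    high_half = [v for v, b in zip(desc, ntile2(len(desc))) if b == 1]  # Percent50_desc = 1
    return (max(low_half) + min(high_half)) / 2

print(median_ntile([1, 1, 2, 3, 4, 5]))   # duplicates do not disturb the split
print(median_ntile([1, 2, 3, 4, 5]))      # odd count: both halves contain the middle row
```

Because only the bucket boundaries matter, ties in the ordering column cannot push the two numberings out of alignment, which is the issue the answer is working around.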

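Answer 25: (score: 0)

For large-scale datasets, you can try this GIST:

https://gist.github.com/chrisknoll/1b38761ce8c5016ec5b2

It works by aggregating the distinct values found in your set (such as ages, years of birth, and so on) and uses SQL window functions to locate whichever percentile position you specify in the query.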

Answer 26: (score: 0)

DECLARE @Obs int
DECLARE @RowAsc table
(
ID      INT IDENTITY,
Observation  FLOAT
)
INSERT INTO @RowAsc
SELECT Observations FROM MyTable
ORDER BY 1 
SELECT @Obs=COUNT(*)/2 FROM @RowAsc
SELECT Observation AS Median FROM @RowAsc WHERE ID=@Obs

Answer 27: (score: 0)

The following solution works under these assumptions:

  • No duplicate values
  • No NULLs

Code:

IF OBJECT_ID('dbo.R', 'U') IS NOT NULL
  DROP TABLE dbo.R

CREATE TABLE R (
    A FLOAT NOT NULL);

INSERT INTO R VALUES (1);
INSERT INTO R VALUES (2);
INSERT INTO R VALUES (3);
INSERT INTO R VALUES (4);
INSERT INTO R VALUES (5);
INSERT INTO R VALUES (6);

-- Returns Median(R)
select SUM(A) / CAST(COUNT(A) AS FLOAT)
from R R1 
where ((select count(A) from R R2 where R1.A > R2.A) = 
      (select count(A) from R R2 where R1.A < R2.A)) OR
      ((select count(A) from R R2 where R1.A > R2.A) + 1 = 
      (select count(A) from R R2 where R1.A < R2.A)) OR
      ((select count(A) from R R2 where R1.A > R2.A) = 
      (select count(A) from R R2 where R1.A < R2.A) + 1) ; 
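The three OR branches boil down to "the number of smaller values and the number of larger values differ by at most one". A Python sketch of that idea, under the same no-duplicates and no-NULLs assumptions (the function name is made up):

```python
def median_by_counts(values):
    """Assumes distinct, non-NULL values, as the answer states."""
    picked = []
    for a in values:
        smaller = sum(1 for b in values if b < a)   # rows with R1.A > R2.A
        larger = sum(1 for b in values if b > a)    # rows with R1.A < R2.A
        if abs(smaller - larger) <= 1:
            picked.append(a)
    return sum(picked) / len(picked)                # SUM(A) / COUNT(A)

print(median_by_counts([1, 2, 3, 4, 5, 6]))
```

Like the query, the sketch compares every value with every other value, so the work grows quadratically with the number of rows.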

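Answer 28: (score: 0)

This is as simple an answer as I could come up with. It worked well with my data. If you want to exclude certain values, just add a where clause to the inner select.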

SELECT TOP 1 
    ValueField AS MedianValue
FROM
    (SELECT TOP(SELECT COUNT(1)/2 FROM tTABLE)
        ValueField
    FROM 
        tTABLE
    ORDER BY 
        ValueField) A
ORDER BY
    ValueField DESC

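Answer 29: (score: 0)

For newbies like me who are learning the very basics, I personally find this example easier to follow, because it is easier to understand exactly what is happening and where the median values come from...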

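-- Note: (max + min) / 2 is the midrange; it equals the median only when the values are spread symmetrically.
select
 ( max(a.[Value1]) + min(a.[Value1]) ) / 2 as [Median Value1]
,( max(a.[Value2]) + min(a.[Value2]) ) / 2 as [Median Value2]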

from (select
    datediff(dd,startdate,enddate) as [Value1]
    ,xxxxxxxxxxxxxx as [Value2]
     from dbo.table1
     )a

In absolute awe of some of the code above, though!!!

Answer 30: (score: 0)

This works with SQL 2000:

DECLARE @testTable TABLE 
( 
    VALUE   INT
)
--INSERT INTO @testTable -- Even Test
--SELECT 3 UNION ALL
--SELECT 5 UNION ALL
--SELECT 7 UNION ALL
--SELECT 12 UNION ALL
--SELECT 13 UNION ALL
--SELECT 14 UNION ALL
--SELECT 21 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 29 UNION ALL
--SELECT 40 UNION ALL
--SELECT 56

--
--INSERT INTO @testTable -- Odd Test
--SELECT 3 UNION ALL
--SELECT 5 UNION ALL
--SELECT 7 UNION ALL
--SELECT 12 UNION ALL
--SELECT 13 UNION ALL
--SELECT 14 UNION ALL
--SELECT 21 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 29 UNION ALL
--SELECT 39 UNION ALL
--SELECT 40 UNION ALL
--SELECT 56


DECLARE @RowAsc TABLE
(
    ID      INT IDENTITY,
    Amount  INT
)

INSERT INTO @RowAsc
SELECT  VALUE 
FROM    @testTable 
ORDER BY VALUE ASC

SELECT  AVG(amount)
FROM @RowAsc ra
WHERE ra.id IN
(
    SELECT  ID 
    FROM    @RowAsc
    WHERE   ra.id -
    (
        SELECT  MAX(id) / 2.0 
        FROM    @RowAsc
    ) BETWEEN 0 AND 1

)
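The BETWEEN 0 AND 1 condition picks the identity values that sit within one position above MAX(id) / 2.0. Here is a small Python model of that selection, run on the even and odd test sets from the commented-out inserts. It assumes the identity values follow the ORDER BY of the insert, and the function name is hypothetical.

```python
def median_identity_window(values):
    ordered = sorted(values)                  # IDs 1..n follow the sort order
    half = len(ordered) / 2.0                 # MAX(id) / 2.0
    picked = [v for row_id, v in enumerate(ordered, start=1)
              if 0 <= row_id - half <= 1]     # BETWEEN 0 AND 1
    return sum(picked) / len(picked)

even_test = [3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56]
odd_test = [3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56]
print(median_identity_window(even_test))   # IDs 7 and 8
print(median_identity_window(odd_test))    # ID 8 only
```

In T-SQL, AVG over an INT column returns an INT, so an average that ends in .5 would be truncated unless the column is converted first.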

Answer 31: (score: 0)

--Create Temp Table to Store Results in
DECLARE @results AS TABLE 
(
    [Month] datetime not null
 ,[Median] int not null
);

--This variable will determine the date
DECLARE @IntDate as int 
set @IntDate = -13


WHILE (@IntDate < 0) 
BEGIN

--Create Temp Table
DECLARE @table AS TABLE 
(
    [Rank] int not null
 ,[Days Open] int not null
);

--Insert records into Temp Table
insert into @table 

SELECT 
    rank() OVER (ORDER BY DATEADD(mm, DATEDIFF(mm, 0, DATEADD(ss, SVR.close_date, '1970')), 0), DATEDIFF(day,DATEADD(ss, SVR.open_date, '1970'),DATEADD(ss, SVR.close_date, '1970')),[SVR].[ref_num]) as [Rank]
 ,DATEDIFF(day,DATEADD(ss, SVR.open_date, '1970'),DATEADD(ss, SVR.close_date, '1970')) as [Days Open]
FROM
 mdbrpt.dbo.View_Request SVR
 LEFT OUTER JOIN dbo.dtv_apps_systems vapp 
 on SVR.category = vapp.persid
 LEFT OUTER JOIN dbo.prob_ctg pctg 
 on SVR.category = pctg.persid
 Left Outer Join [mdbrpt].[dbo].[rootcause] as [Root Cause] 
 on [SVR].[rootcause]=[Root Cause].[id]
 Left Outer Join [mdbrpt].[dbo].[cr_stat] as [Status]
 on [SVR].[status]=[Status].[code]
 LEFT OUTER JOIN [mdbrpt].[dbo].[net_res] as [net] 
 on [net].[id]=SVR.[affected_rc]
WHERE
 SVR.Type IN ('P') 
 AND
 SVR.close_date IS NOT NULL 
 AND
 [Status].[SYM] = 'Closed'
 AND
 SVR.parent is null
 AND
 [Root Cause].[sym] in ( 'RC - Application','RC - Hardware', 'RC - Operational', 'RC - Unknown')
 AND
 (
  [vapp].[appl_name] in ('3PI','Billing Rpts/Files','Collabrent','Reports','STMS','STMS 2','Telco','Comergent','OOM','C3-BAU','C3-DD','DIRECTV','DIRECTV Sales','DIRECTV Self Care','Dealer Website','EI Servlet','Enterprise Integration','ET','ICAN','ODS','SB-SCM','SeeBeyond','Digital Dashboard','IVR','OMS','Order Services','Retail Services','OSCAR','SAP','CTI','RIO','RIO Call Center','RIO Field Services','FSS-RIO3','TAOS','TCS')
 OR
  pctg.sym in ('Systems.Release Health Dashboard.Problem','DTV QA Test.Enterprise Release.Deferred Defect Log')
 AND  
  [Net].[nr_desc] in ('3PI','Billing Rpts/Files','Collabrent','Reports','STMS','STMS 2','Telco','Comergent','OOM','C3-BAU','C3-DD','DIRECTV','DIRECTV Sales','DIRECTV Self Care','Dealer Website','EI Servlet','Enterprise Integration','ET','ICAN','ODS','SB-SCM','SeeBeyond','Digital Dashboard','IVR','OMS','Order Services','Retail Services','OSCAR','SAP','CTI','RIO','RIO Call Center','RIO Field Services','FSS-RIO3','TAOS','TCS')
 )
 AND
 DATEADD(mm, DATEDIFF(mm, 0, DATEADD(ss, SVR.close_date, '1970')), 0) = DATEADD(mm, DATEDIFF(mm,0,DATEADD(mm,@IntDate,getdate())), 0)
ORDER BY [Days Open]



DECLARE @Count AS INT
SELECT @Count = COUNT(*) FROM @table;

WITH MyResults(RowNo, [Days Open]) AS
(
    SELECT RowNo, [Days Open] FROM
        (SELECT ROW_NUMBER() OVER (ORDER BY [Days Open]) AS RowNo, [Days Open] FROM @table) AS Foo
)


insert into @results
SELECT 
 DATEADD(mm, DATEDIFF(mm,0,DATEADD(mm,@IntDate,getdate())), 0) as [Month]
 ,AVG([Days Open])as [Median] FROM MyResults WHERE RowNo = (@Count+1)/2 OR RowNo = ((@Count+1)%2) * ((@Count+2)/2) 


set @IntDate = @IntDate+1
DELETE FROM @table
END

select *
from @results
order by [Month]

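Answer 32: (score: 0)

I wanted to work out a solution by myself, but my brain tripped and fell on the way. I think it works, but don't ask me to explain it in the morning. :P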

DECLARE @table AS TABLE
(
    Number int not null
);

insert into @table select 2;
insert into @table select 4;
insert into @table select 9;
insert into @table select 15;
insert into @table select 22;
insert into @table select 26;
insert into @table select 37;
insert into @table select 49;

DECLARE @Count AS INT
SELECT @Count = COUNT(*) FROM @table;

WITH MyResults(RowNo, Number) AS
(
    SELECT RowNo, Number FROM
        (SELECT ROW_NUMBER() OVER (ORDER BY Number) AS RowNo, Number FROM @table) AS Foo
)
SELECT AVG(1.0 * Number) FROM MyResults WHERE RowNo = (@Count+1)/2 OR RowNo = ((@Count+1)%2) * ((@Count+2)/2)
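Since the WHERE clause is the part that is hard to explain, here is a Python sketch of which row numbers it matches. The function name is hypothetical, and the sample data is the set inserted into @table above.

```python
def middle_rows(count):
    """1-based RowNo values matched by the WHERE clause above."""
    first = (count + 1) // 2
    second = ((count + 1) % 2) * ((count + 2) // 2)   # 0 when count is odd
    return sorted({first, second} - {0})              # RowNo is never 0

numbers = [2, 4, 9, 15, 22, 26, 37, 49]
rows = middle_rows(len(numbers))
median = sum(sorted(numbers)[r - 1] for r in rows) / len(rows)
print(rows, median)
```

The modulo factor switches the second condition off for odd counts (it evaluates to row 0, which does not exist) and on for even counts. In T-SQL, AVG over an INT expression returns an INT, so the values need converting (for example with 1.0 * Number) for a .5 result to survive.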

Answer 33: (score: -2)

Try the logic below to find the median:

Consider a table containing the following numbers: 1, 1, 2, 3, 4, 5

The median is 2.5

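with tempa as
(
    select num, count(num) over () as Cnt,
           row_number() over (order by num) as Rnum
    from temp
),
tempb as
(
    select round(cnt / 2) as ref_value from tempa where mod(cnt, 2) <> 0
    union all
    select round(cnt / 2) from tempa where mod(cnt, 2) = 0
    union all
    select round(cnt / 2 + 1) from tempa where mod(cnt, 2) = 0
)
select avg(num) from tempa
where rnum in (select * from tempb);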