SQL Server - Group By, Average and Percentiles

Time: 2016-06-17 09:49:32

Tags: sql sql-server group-by average

I have a FormSummaries table in SQL Server with the following relevant columns and sample data:

FormName | CompletionTime
Form1    | 70
Form1    | 20
Form1    | 30
Form1    | 40
Form1    | 80
Form1    | 60
Form1    | 90
Form1    | 10
Form2    | 30
Form2    | 40
Form2    | 80
Form2    | 90
Form2    | 40
Form2    | 1000
Form2    | 120
Form2    | 70

What I need to do is:

1) Group the data by form name and get the average completion time for each form, which is easy enough:

SELECT 
    FormName, AVG(CompletionTime) 
FROM 
    FormSummaries 
WHERE 
    CompletionTime  is not null
GROUP BY
    FormName

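2) Get the average of the top 25% / bottom 25% of completion times for each form type (i.e. the average of the fastest and slowest 25% of completions for each form). Ideally this would be a single query, i.e.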

FormName | Bottom25%AverageCompletionTime | Top25%AverageCompletionTime
Form1    | 85                             | 15
Form2    | 560                            | 35

I live in the real world and realize this may not be possible, so separate queries for top and bottom would be fine, i.e.

FormName | Bottom25%AverageCompletionTime
Form1    | 85                            
Form2    | 560                           

FormName | Top25%AverageCompletionTime
Form1    | 15
Form2    | 35

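I have looked at PARTITION BY, NTILE and OVER, but I can't seem to get anything that produces the expected results (although that may be because I'm not implementing them correctly!).

Can anyone help?

Many thanks.

2 answers:

Answer 0 (score: 1)

NTILE ranks the results in chunks. You are interested in quarters, so use NTILE(4) to split the rows into 4 groups, partitioned by FormName. With the rows ordered by CompletionTime ascending, group 1 holds the fastest quarter and group 4 the slowest. To do this with 2 queries, try: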

-- top 25%
SELECT  formname, AVG(CompletionTime) 
FROM
(SELECT 
    FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM 
    FormSummaries
WHERE CompletionTime IS NOT NULL )
    x
WHERE  QuartPercentile = 1
GROUP BY formname

-- bottom 25%
SELECT  formname, AVG(CompletionTime) 
FROM
(SELECT 
    FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM 
    FormSummaries 
WHERE CompletionTime IS NOT NULL)
    x
WHERE  QuartPercentile = 4
GROUP BY formname

Or, using a single query:

SELECT  formname,AVG( case when QuartPercentile = 4 then CompletionTime else null end)   as [Bottom25%AverageCompletionTime]
, AVG( case when QuartPercentile = 1 then CompletionTime else null end)   as [Top25%AverageCompletionTime]
FROM
(SELECT 
    FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM 
    FormSummaries 
WHERE CompletionTime IS NOT NULL)
    x

GROUP BY formname
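
As a quick cross-check outside the database, the NTILE(4) logic above can be emulated in a few lines of Python. This is only a sketch: the helper names `ntile` and `quartile_averages` are made up for illustration, and the tile sizing follows the documented NTILE behaviour of giving earlier tiles one extra row when the count does not divide evenly. It reproduces the figures the question expects from the sample data.

```python
from collections import defaultdict

# Sample rows from the question: (FormName, CompletionTime)
ROWS = [
    ("Form1", 70), ("Form1", 20), ("Form1", 30), ("Form1", 40),
    ("Form1", 80), ("Form1", 60), ("Form1", 90), ("Form1", 10),
    ("Form2", 30), ("Form2", 40), ("Form2", 80), ("Form2", 90),
    ("Form2", 40), ("Form2", 1000), ("Form2", 120), ("Form2", 70),
]


def ntile(sorted_values, buckets):
    """Pair each value with a 1-based tile number, mimicking NTILE(buckets).

    When the row count is not divisible by `buckets`, the earlier
    tiles receive one extra row each.
    """
    size, extra = divmod(len(sorted_values), buckets)
    tiled, index = [], 0
    for tile in range(1, buckets + 1):
        count = size + (1 if tile <= extra else 0)
        for value in sorted_values[index:index + count]:
            tiled.append((tile, value))
        index += count
    return tiled


def average(values):
    """Average of a list, or None for an empty list (like AVG over no rows)."""
    return sum(values) / len(values) if values else None


def quartile_averages(rows):
    """Return {form: (average of tile 4, average of tile 1)}."""
    by_form = defaultdict(list)
    for form, completion_time in rows:
        if completion_time is not None:      # WHERE CompletionTime IS NOT NULL
            by_form[form].append(completion_time)

    result = {}
    for form, values in by_form.items():
        tiled = ntile(sorted(values), 4)     # PARTITION BY form ORDER BY time
        fastest = [v for t, v in tiled if t == 1]
        slowest = [v for t, v in tiled if t == 4]
        result[form] = (average(slowest), average(fastest))
    return result


for form, (bottom, top) in sorted(quartile_averages(ROWS).items()):
    print(form, bottom, top)
# Form1 85.0 15.0
# Form2 560.0 35.0
```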

Keep in mind that if your CompletionTime column holds integers, AVG will return an integer, so you may want to cast to get the precision you need, e.g.

AVG( case when QuartPercentile = 1 then cast(CompletionTime AS decimal(9,2))  else null end) 
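
The effect of the cast can be illustrated with a small Python sketch. The values below are hypothetical (the sample data in the question happens to average to whole numbers), the helper names are invented, and the integer version assumes non-negative completion times so that floor division matches truncation.

```python
from decimal import Decimal


def avg_as_int(values):
    """Emulate AVG over an int column: the fractional part is dropped.

    Assumes non-negative values, where floor division equals truncation.
    """
    return sum(values) // len(values)


def avg_as_decimal(values):
    """Emulate AVG after casting each value to a decimal type."""
    return sum(Decimal(v) for v in values) / Decimal(len(values))


times = [10, 15]                 # hypothetical completion times
print(avg_as_int(times))         # 12
print(avg_as_decimal(times))     # 12.5
```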

Answer 1 (score: 0)

You can use CTE + PIVOT:

;WITH PercentCount AS (
    -- Per form: a quarter of the row count (integer division) and the total row count
    SELECT  FormName,
            COUNT(*)/4 AS [Bottom25Percent],
            COUNT(*)   AS [Top25Percent]
    FROM FormSummaries
    GROUP BY FormName
), FormsWithRowNumber AS (
    -- Number the rows of each form from fastest to slowest
    SELECT  f.FormName,
            f.CompletionTime,
            ROW_NUMBER() OVER (PARTITION BY f.FormName ORDER BY f.CompletionTime) AS rn
    FROM FormSummaries f
), final AS (
    -- Tag the first quarter of rows as 1, the last quarter as 2, everything else as 0
    SELECT  f.FormName,
            f.CompletionTime,
            CASE WHEN f.rn BETWEEN 1 AND [Bottom25Percent] THEN 1
                 WHEN f.rn BETWEEN [Top25Percent]-[Bottom25Percent]+1 AND [Top25Percent] THEN 2
                 ELSE 0 END AS [TopBottom]
    FROM FormsWithRowNumber f
    INNER JOIN PercentCount p
        ON p.FormName = f.FormName
)

-- Column [1] is the average of the fastest quarter, [2] of the slowest quarter
SELECT *
FROM final
PIVOT (
    AVG(CompletionTime) FOR TopBottom IN ([1],[2])
) AS pvt
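
One thing worth checking is how this ROW_NUMBER approach compares with NTILE(4) when a form's row count is not a multiple of 4. The Python sketch below emulates both splits on ten hypothetical values; the function names are invented for illustration. With the 8-row sample data from the question the two approaches select the same rows, but with ten rows NTILE puts three rows in the first group while COUNT(*)/4 selects only two.

```python
def row_number_groups(values):
    """Split as the CTE does: the first and last COUNT(*)/4 rows by row number."""
    ordered = sorted(values)
    quarter = len(ordered) // 4                    # COUNT(*)/4, integer division
    lowest = ordered[:quarter]                     # rn BETWEEN 1 AND quarter
    highest = ordered[len(ordered) - quarter:]     # rn BETWEEN count-quarter+1 AND count
    return lowest, highest


def ntile_groups(values, buckets=4):
    """Return the rows that NTILE(buckets) would place in the first and last tile."""
    ordered = sorted(values)
    size, extra = divmod(len(ordered), buckets)
    first_count = size + (1 if extra >= 1 else 0)  # earlier tiles take the extras
    last_count = size                              # the last tile never gets an extra row
    return ordered[:first_count], ordered[len(ordered) - last_count:]


times = list(range(10, 101, 10))   # ten hypothetical completion times: 10, 20, ..., 100

print(row_number_groups(times))    # ([10, 20], [90, 100])
print(ntile_groups(times))         # ([10, 20, 30], [90, 100])
```

So for uneven counts the average of the fastest group can differ between the two answers (15 versus 20 in this made-up example), while the slowest group is the same.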