SQL percentile calculation runs very slowly - need help speeding it up

Time: 2017-09-17 19:21:34

Tags: sql sql-server performance database-performance

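I have a large table (about 850 million rows at the moment) and need to recompute percentile values every week, because the statistics become stale as new data is inserted. However, the process is very slow (5-6 hours with my hardware and current query).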

How can I change the query to make it run faster?

Right now, my query is basically this:

SELECT DISTINCT [ident1]
    ,[ident2]
    ,[ident3]
    ,[ident4]
    ,percentile_cont(0.05)
        WITHIN GROUP (
            ORDER BY [value] ASC
        ) OVER (
            PARTITION BY [ident1]
                ,[ident2]
                ,[ident3]
                ,[ident4]
        ) AS [percentile_5]
    ,percentile_cont(0.10)
        WITHIN GROUP (
            ORDER BY [value] ASC
        ) OVER (
            PARTITION BY [ident1]
                ,[ident2]
                ,[ident3]
                ,[ident4]
        ) AS [percentile_10]
    ,percentile_cont(0.25)
        WITHIN GROUP (
            ORDER BY [value] ASC
        ) OVER (
            PARTITION BY [ident1]
                ,[ident2]
                ,[ident3]
                ,[ident4]
        ) AS [percentile_25]
    ,percentile_cont(0.50)
        WITHIN GROUP (
            ORDER BY [value] ASC
        ) OVER (
            PARTITION BY [ident1]
                ,[ident2]
                ,[ident3]
                ,[ident4]
        ) AS [percentile_50]
    ,percentile_cont(0.75)
        WITHIN GROUP (
            ORDER BY [value] ASC
        ) OVER (
            PARTITION BY [ident1]
                ,[ident2]
                ,[ident3]
                ,[ident4]
        ) AS [percentile_75]
    ,percentile_cont(0.90)
        WITHIN GROUP (
            ORDER BY [value] ASC
        ) OVER (
            PARTITION BY [ident1]
                ,[ident2]
                ,[ident3]
                ,[ident4]
        ) AS [percentile_90]
    ,percentile_cont(0.95)
        WITHIN GROUP (
            ORDER BY [value] ASC
        ) OVER (
            PARTITION BY [ident1]
                ,[ident2]
                ,[ident3]
                ,[ident4]
        ) AS [percentile_95]
FROM dataTable

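I think part of the problem is that, without the DISTINCT, I get one row for every value in the DB. Is SQL smart enough to compute the percentiles only once per group, or does it repeat the calculation for every value?

Any help is greatly appreciated.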

1 Answer:

Answer 0: (score: 0)

Put the distinct in a subquery:

select . . .
from (select distinct . . . ) s;

The select distinct should happen after all the columns have been computed.
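
As a rough sketch of how that pattern might look when applied to the query in the question (this is only one reading of the answer, not a confirmed fix): the window functions and the DISTINCT sit inside the derived table `s`, and the outer select just reads the de-duplicated rows. Only three of the seven percentiles are shown to keep it short, and whether this actually changes the execution plan or the runtime is an assumption that would need to be checked against the real data.

```sql
-- Sketch only: column and table names are taken from the question;
-- the alias s comes from the answer. Performance impact is untested.
SELECT s.[ident1]
    ,s.[ident2]
    ,s.[ident3]
    ,s.[ident4]
    ,s.[percentile_5]
    ,s.[percentile_50]
    ,s.[percentile_95]
FROM (
    -- Percentiles are computed per partition, then duplicates are collapsed
    SELECT DISTINCT [ident1]
        ,[ident2]
        ,[ident3]
        ,[ident4]
        ,PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY [value] ASC)
            OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_5]
        ,PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY [value] ASC)
            OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_50]
        ,PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY [value] ASC)
            OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_95]
    FROM dataTable
) AS s;
```

Comparing the actual execution plans of this version and the original would show whether the optimizer treats them any differently.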