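I have a large table (currently about 850 million rows) and need to recalculate percentile values every week as new data is inserted, because the inserts make the previously computed statistics stale. However, the process is very slow (5-6 hours with my hardware and current query).
How can I change the query to make it faster?
Right now, my query looks essentially like this: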
SELECT DISTINCT
    [ident1]
    ,[ident2]
    ,[ident3]
    ,[ident4]
    ,percentile_cont(0.05) WITHIN GROUP (ORDER BY [value] ASC)
        OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_5]
    ,percentile_cont(0.10) WITHIN GROUP (ORDER BY [value] ASC)
        OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_10]
    ,percentile_cont(0.25) WITHIN GROUP (ORDER BY [value] ASC)
        OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_25]
    ,percentile_cont(0.50) WITHIN GROUP (ORDER BY [value] ASC)
        OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_50]
    ,percentile_cont(0.75) WITHIN GROUP (ORDER BY [value] ASC)
        OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_75]
    ,percentile_cont(0.90) WITHIN GROUP (ORDER BY [value] ASC)
        OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_90]
    ,percentile_cont(0.95) WITHIN GROUP (ORDER BY [value] ASC)
        OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_95]
FROM dataTable
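I think part of the problem is that, without the DISTINCT, I get one row back for every value in the table. Is SQL smart enough to calculate the percentiles only once per group, or does it recalculate them for every value?
Any help is much appreciated.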
Answer 0 (score: 0)
Put the distinct in a subquery:
select . . .
from (select distinct . . . ) s;
The select should happen after all of the distinct columns have been calculated.
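As an illustration, here is one possible expansion of that outline. It is a sketch only: the column lists standing in for the elided parts are assumptions rather than the answerer's exact query, and only two of the seven percentiles are shown (the others would follow the same pattern). The table and column names are taken from the question.

```sql
-- Sketch only: the contents of both SELECT lists are assumed,
-- since the answer leaves them elided.
SELECT s.[ident1], s.[ident2], s.[ident3], s.[ident4],
       s.[percentile_5], s.[percentile_50]
FROM (
    -- Window functions in a SELECT list are evaluated before DISTINCT,
    -- so each group collapses to one row only after its percentiles exist.
    SELECT DISTINCT
           [ident1], [ident2], [ident3], [ident4],
           PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY [value] ASC)
               OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_5],
           PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY [value] ASC)
               OVER (PARTITION BY [ident1], [ident2], [ident3], [ident4]) AS [percentile_50]
    FROM dataTable
) AS s;
```

Whether this form actually runs faster than the original is not something the sketch can show; comparing the execution plans of the two queries on the real table would be the way to check.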