How do I perform statistical calculations in a query?

Time: 2012-12-11 08:23:42

Tags: sql-server distribution statistics

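I have a table filled with float values. I need to count the results grouped by their distribution around the mean (Gaussian distribution). Basically, it is calculated like this: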

SELECT COUNT(*), FloatColumn - AVG(FloatColumn) - STDEV(FloatColumn) 
FROM Data 
GROUP BY FloatColumn - AVG(FloatColumn) - STDEV(FloatColumn)

But for obvious reasons, SQL Server raises this error: Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause.

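My question is: can I somehow leave this calculation to SQL Server? Or do I have to do it the old-fashioned way, retrieving all the data and doing the calculation myself?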

2 answers:

Answer 0: (score: 2)

To get an aggregate over the whole set, you can use an empty OVER clause:

WITH T(Result)
     AS (SELECT FloatColumn - Avg(FloatColumn) OVER() - Stdev(FloatColumn) OVER ()
         FROM   Data)
SELECT Count(*),
       Result
FROM   T
GROUP  BY Result 
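
To see why this works, it may help to look at the windowed aggregates on their own. The following is a minimal sketch against the same Data table; the aliases MeanAll and StdevAll are illustrative names, not part of the original query.

```sql
-- Sketch: an aggregate with an empty OVER () is evaluated over the whole
-- result set and its value is repeated on every row, so the outer query
-- can group by an ordinary (non-aggregate) column.
SELECT FloatColumn,
       AVG(FloatColumn)   OVER () AS MeanAll,   -- illustrative alias
       STDEV(FloatColumn) OVER () AS StdevAll   -- illustrative alias; STDEV is the sample standard deviation
FROM   Data;
```

Because every row already carries the set-wide mean and standard deviation, the expression in the CTE is a plain per-row value by the time GROUP BY sees it, which is what avoids the error from the question.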

Answer 1: (score: 1)

SQL Fiddle

You can pre-aggregate the data and then join the result back to the table.

Schema Setup

create table data(floatcolumn float);
insert data values
  (1234.56),
  (134.56),
  (134.56),
  (234.56),
  (1349),
  (900);

Query 1

SELECT COUNT(*) C, D.FloatColumn - A
  FROM
    (
    SELECT AVG(FloatColumn) + STDEV(FloatColumn) A
        FROM Data
    ) preagg
CROSS JOIN Data D
GROUP BY FloatColumn - A;

Results

| C |           COLUMN_1 |
--------------------------
| 2 | -1196.876067819572 |
| 1 | -1096.876067819572 |
| 1 |  -431.436067819572 |
| 1 |   -96.876067819572 |
| 1 |    17.563932180428 |
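
If what is actually wanted is a count per band of the distribution rather than per distinct offset, the same pre-aggregate pattern could be adapted. The following is only a sketch under that assumption: the question does not say how wide the bands should be, so whole standard deviations are assumed here, and the names Mean, Sd, Band and P are hypothetical.

```sql
-- Sketch only: assumes one bucket per whole standard deviation from the mean.
-- Mean, Sd, Band and P are illustrative names, not from the original post.
SELECT FLOOR((D.FloatColumn - P.Mean) / NULLIF(P.Sd, 0)) AS Band,  -- NULLIF avoids division by zero
       COUNT(*) AS C
FROM   (SELECT AVG(FloatColumn)   AS Mean,
               STDEV(FloatColumn) AS Sd
        FROM   Data) AS P
CROSS JOIN Data AS D
GROUP BY FLOOR((D.FloatColumn - P.Mean) / NULLIF(P.Sd, 0))
ORDER BY Band;
```

Band 0 would hold values from the mean up to one standard deviation above it, band -1 the values up to one standard deviation below it, and so on. If the standard deviation is zero or NULL (for example, with a single row), every row falls into a NULL band.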