Determining histogram bin sizes

Time: 2010-08-25 13:56:52

Tags: sql sql-server sql-server-2005 tsql

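I'm looking to create a histogram in SQL (which in itself isn't too tricky), but what I'm after is a way of splitting the bins so that each bin contains the same proportion of the data.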

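For example, if I have the sample data below (the Value column) and want to split it into 5 bins, I know I can work out the bin width by doing something like

(MAX(Value) - MIN(Value)) / numberofsteps

which gives the groups shown in the band 1 column.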
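As a rough sketch of that equal-width approach (not taken from the original post), the query below assigns each value to a bin using the formula above. The names @NumBins, Stats, MinValue and BinWidth are made up for illustration, and the sketch assumes the data has at least two distinct values so the bin width is not zero.

```sql
-- @NumBins is a hypothetical variable: the number of equal-width bins.
DECLARE @NumBins int;
SET @NumBins = 5;   -- separate SET, since SQL Server 2005 has no inline initializer

WITH Vals AS
(
    SELECT 1 AS value UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
    UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 3
    UNION ALL SELECT 4 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
    UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 8 UNION ALL SELECT 8
    UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 10 UNION ALL SELECT 10
),
Stats AS
(
    -- Multiply by 1.0 to avoid integer division when computing the width.
    SELECT MIN(value) AS MinValue,
           (MAX(value) - MIN(value)) * 1.0 / @NumBins AS BinWidth
    FROM Vals
)
SELECT v.value,
       -- The maximum value would otherwise land in bin @NumBins + 1, so clamp it.
       CASE WHEN FLOOR((v.value - s.MinValue) / s.BinWidth) >= @NumBins
            THEN @NumBins
            ELSE CAST(FLOOR((v.value - s.MinValue) / s.BinWidth) AS int) + 1
       END AS BinNumber
FROM Vals AS v
CROSS JOIN Stats AS s
ORDER BY v.value;
```

Each bin covers the same range of values here, but the number of rows per bin varies with how the data is distributed.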

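However, what I want is to calculate the bands so that each band contains (100 / n)% of the total, where n is the number of bands (so in this case each of the 5 bands would represent 20% of the total data). This is what is shown in the band 2 column: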
Value | band 1    | band 2
1     | 1 to 2    | 0 to 1
1     | 1 to 2    | 0 to 1
1     | 1 to 2    | 0 to 1
1     | 1 to 2    | 0 to 1
2     | 1 to 2    | 2 to 3
2     | 1 to 2    | 2 to 3
3     | 3 to 4    | 2 to 3
3     | 3 to 4    | 2 to 3
4     | 3 to 4    | 4 to 6
4     | 3 to 4    | 4 to 6
5     | 5 to 6    | 4 to 6
6     | 5 to 6    | 4 to 6
7     | 7 to 8    | 7 to 8
8     | 7 to 8    | 7 to 8
8     | 7 to 8    | 7 to 8
8     | 7 to 8    | 7 to 8
9     | 9 to 10   | 9 to 10
10    | 9 to 10   | 9 to 10
10    | 9 to 10   | 9 to 10
10    | 9 to 10   | 9 to 10

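Is there a way to do this in SQL (I'm using SQL Server 2005, if that helps), preferably without creating a UDF? It would also be great if I could easily change the number of bins. (If it's not possible, just say so!)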

Thanks

1 Answer:

Answer 0: (score: 4)

To split the data into bins, you can use the NTILE function.

-- Sample data from the question (20 rows).
WITH Vals AS
(
    SELECT 1 AS value UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
    UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 3
    UNION ALL SELECT 4 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
    UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 8 UNION ALL SELECT 8
    UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 10 UNION ALL SELECT 10
),
-- NTILE(5) assigns each row to one of 5 groups with (nearly) equal row counts.
TiledVals AS
(
    SELECT value, NTILE(5) OVER (ORDER BY value) AS BinNumber
    FROM Vals
)
-- Each bin's boundaries are the smallest and largest value assigned to it.
SELECT value, BinNumber,
       MIN(value) OVER (PARTITION BY BinNumber) AS StartBin,
       MAX(value) OVER (PARTITION BY BinNumber) AS EndBin
FROM TiledVals
ORDER BY value, BinNumber;

This gives:

value       BinNumber            StartBin    EndBin
----------- -------------------- ----------- -----------
1           1                    1           1
1           1                    1           1
1           1                    1           1
1           1                    1           1
2           2                    2           3
2           2                    2           3
3           2                    2           3
3           2                    2           3
4           3                    4           6
4           3                    4           6
5           3                    4           6
6           3                    4           6
7           4                    7           8
8           4                    7           8
8           4                    7           8
8           4                    7           8
9           5                    9           10
10          5                    9           10
10          5                    9           10
10          5                    9           10
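The question also asks for an easy way to change the number of bins. One possible variation, sketched here rather than taken from the original answer, is to pass a variable to NTILE and collapse the result to one row per bin. The names @NumBins and RowsInBin are hypothetical.

```sql
-- @NumBins is a hypothetical variable; change its value to change the bin count.
DECLARE @NumBins int;
SET @NumBins = 5;   -- separate SET, since SQL Server 2005 has no inline initializer

WITH Vals AS
(
    SELECT 1 AS value UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
    UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 3
    UNION ALL SELECT 4 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
    UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 8 UNION ALL SELECT 8
    UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 10 UNION ALL SELECT 10
),
TiledVals AS
(
    SELECT value, NTILE(@NumBins) OVER (ORDER BY value) AS BinNumber
    FROM Vals
)
-- One row per bin: its boundaries and how many rows fell into it.
SELECT BinNumber,
       MIN(value) AS StartBin,
       MAX(value) AS EndBin,
       COUNT(*)   AS RowsInBin
FROM TiledVals
GROUP BY BinNumber
ORDER BY BinNumber;
```

With the sample data and 5 bins, each bin ends up with 4 of the 20 rows. Note that NTILE divides rows by position, so with other data identical values can be split across two adjacent bins, in which case one bin's EndBin equals the next bin's StartBin.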