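I'm looking to create a histogram in SQL (which in itself isn't too tricky), but what I'm after is a way of splitting the bins so that each bin/band contains the same proportion of the data.
For example, if I have the sample data below (the Value column) and I want to divide it into 5 bins, I know I can work out the bin width by doing something like
(MAX(Value) - MIN(Value)) / numberofsteps
which gives the groups shown in the band 1 column.
However, what I want is to calculate the bands so that each accounts for (100 / n)% of the total, where n is the number of bands (so in this case each of the 5 bands represents 20% of the total data). This is what is shown in the band 2 column.
Value | band 1 | band 2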
1 | 1 to 2 | 0 to 1
1 | 1 to 2 | 0 to 1
1 | 1 to 2 | 0 to 1
1 | 1 to 2 | 0 to 1
2 | 1 to 2 | 2 to 3
2 | 1 to 2 | 2 to 3
3 | 3 to 4 | 2 to 3
3 | 3 to 4 | 2 to 3
4 | 3 to 4 | 4 to 6
4 | 3 to 4 | 4 to 6
5 | 5 to 6 | 4 to 6
6 | 5 to 6 | 4 to 6
7 | 7 to 8 | 7 to 8
8 | 7 to 8 | 7 to 8
8 | 7 to 8 | 7 to 8
8 | 7 to 8 | 7 to 8
9 | 9 to 10 | 9 to 10
10 | 9 to 10 | 9 to 10
10 | 9 to 10 | 9 to 10
10 | 9 to 10 | 9 to 10
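Is there a way to do this in SQL (I'm using SQL Server 2005, if that helps), ideally without creating a UDF? Being able to vary the number of bins easily would be great (if that's not asking the impossible!).
Thanks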
Answer 0 (score: 4)
To split the data into bins, you can use the NTILE function.
-- Sample data from the question
WITH Vals AS
(
    SELECT 1 AS value UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 8 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 10 UNION ALL SELECT 10
), TiledVals AS
(
    -- NTILE(5) numbers the rows into 5 groups of (as near as possible) equal row count
    SELECT value, NTILE(5) OVER (ORDER BY value) AS BinNumber
    FROM Vals
)
SELECT value, BinNumber,
       MIN(value) OVER (PARTITION BY BinNumber) AS StartBin,
       MAX(value) OVER (PARTITION BY BinNumber) AS EndBin
FROM TiledVals
gives
value BinNumber StartBin EndBin
----------- -------------------- ----------- -----------
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
2 2 2 3
2 2 2 3
3 2 2 3
3 2 2 3
4 3 4 6
4 3 4 6
5 3 4 6
6 3 4 6
7 4 7 8
8 4 7 8
8 4 7 8
8 4 7 8
9 5 9 10
10 5 9 10
10 5 9 10
10 5 9 10
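
If you want one row per bin rather than one row per value, and a bin count that is easy to change, the same NTILE result can be aggregated. The following is only a minimal sketch under some assumptions: the table variable @Vals and the variable @NumBins are hypothetical names introduced here for illustration, and it assumes your SQL Server version accepts a variable as the NTILE argument.

```sql
-- Hypothetical bin count; change this value to vary the number of bins.
DECLARE @NumBins int;
SET @NumBins = 5;

-- Hypothetical table variable holding the sample data from the question.
DECLARE @Vals TABLE (value int);
INSERT INTO @Vals (value)
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 3
UNION ALL SELECT 4 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 8 UNION ALL SELECT 8
UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 10 UNION ALL SELECT 10;

WITH TiledVals AS
(
    -- Assign each row to one of @NumBins groups of (near-)equal row count
    SELECT value, NTILE(@NumBins) OVER (ORDER BY value) AS BinNumber
    FROM @Vals
)
-- Collapse to one row per bin: its value range and how many rows it holds
SELECT BinNumber,
       MIN(value) AS StartBin,
       MAX(value) AS EndBin,
       COUNT(*)   AS RowsInBin
FROM TiledVals
GROUP BY BinNumber
ORDER BY BinNumber;
```

One thing to keep in mind: NTILE divides by row count, not by value, so when a run of equal values straddles a bin boundary, the same value can end up in two adjacent bins. The sample data happens to split cleanly, but real data may not.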