Generating a histogram from column values in a database

Time: 2009-01-27 21:41:38

Tags: sql sql-server histogram

Let's say I have a database column 'grade' like this:

|grade|
|    1|
|    2|
|    1|
|    3|
|    4|
|    5|

Is there a non-trivial way in SQL to generate a histogram like this?

|2,1,1,1,1,0|

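where the 2 means that grade 1 occurs twice, the 1s mean that grades {2..5} each occur once, and the 0 means that grade 6 does not occur at all.

I don't mind if the histogram is one row per count.

If it matters, the database is a SQL Server accessed by a perl CGI through unixODBC / FreeTDS.

EDIT: Thanks for your quick replies! It is okay if non-existent values (like grade 6 in the example above) are left out, as long as I can tell which histogram value belongs to which grade.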

7 answers:

Answer 0 (score: 29)

SELECT COUNT(grade) FROM [table] GROUP BY grade ORDER BY grade

I haven't verified it, but it should work. It will not, however, show a count for grade 6, since that grade is not present in the table at all...

Answer 1 (score: 6)

Use a temp table to get your missing values:

CREATE TABLE #tmp(num int)
DECLARE @num int
SET @num = 0
WHILE @num < 10
BEGIN
  INSERT INTO #tmp (num) VALUES (@num)
  SET @num = @num + 1
END


SELECT t.num as [Grade], count(g.Grade) FROM gradeTable g
RIGHT JOIN #tmp t on g.Grade = t.num
GROUP by t.num
ORDER BY 1

Answer 2 (score: 4)

If there are a lot of data points, you can also group ranges together like this:

SELECT FLOOR(grade/5.00)*5 As Grade, 
       COUNT(*) AS [Grade Count]
FROM TableName
GROUP BY FLOOR(Grade/5.00)*5
ORDER BY 1

Also, if you want to label the full range, you can compute the floor and ceiling ahead of time with a CTE.

With GradeRanges As (
  SELECT FLOOR(Score/5.00)*5     As GradeFloor, 
         FLOOR(Score/5.00)*5 + 4 As GradeCeiling
  FROM TableName
)
SELECT GradeFloor,
       CONCAT(GradeFloor, ' to ', GradeCeiling) AS GradeRange,
       COUNT(*) AS [Grade Count]
FROM GradeRanges
GROUP BY GradeFloor, CONCAT(GradeFloor, ' to ', GradeCeiling)
ORDER BY GradeFloor

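Note: In some SQL engines you can GROUP BY an ordinal column index, but in MS SQL, any non-aggregated expression in the SELECT list also has to appear in the GROUP BY clause, which is why the range expression is repeated there.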
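
As a rough, runnable sketch of the range-grouping idea above, the following Python script uses SQLite's in-memory database as a stand-in for SQL Server. The `scores` table, its sample values, and the bucket width of 5 are assumptions made up for illustration. Integer division stands in for FLOOR(x/5.00), which gives the same buckets for non-negative integers.

```python
import sqlite3

# SQLite in-memory database as a stand-in for SQL Server.
# Table name, column name and sample values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (score INTEGER)")
conn.executemany("INSERT INTO scores (score) VALUES (?)",
                 [(3,), (7,), (8,), (12,), (14,), (21,)])

# Integer division by 5, then multiplication by 5, maps each score
# to the lower bound of its 5-wide bucket (0, 5, 10, ...).
rows = conn.execute("""
    SELECT (score / 5) * 5 AS bucket_floor, COUNT(*) AS n
    FROM scores
    GROUP BY (score / 5) * 5
    ORDER BY bucket_floor
""").fetchall()

# Build the "floor to ceiling" label on the client side.
labelled = [(f"{lo} to {lo + 4}", n) for lo, n in rows]
for label, n in labelled:
    print(label, n)
```

Buckets that contain no rows are simply absent from the result, just as with the plain GROUP BY queries in this thread.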

Option 2: You could use case statements to selectively count values into arbitrary bins and then unpivot them to get a row-by-row count of the included values.
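
A minimal sketch of that CASE-based approach, again using SQLite from Python as a stand-in for SQL Server. The table name `grades`, the three bin boundaries, and the client-side "unpivot" step are assumptions chosen for illustration, not part of the original answer.

```python
import sqlite3

# SQLite in-memory database as a stand-in for SQL Server.
# The data is the sample column from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (grade INTEGER)")
conn.executemany("INSERT INTO grades (grade) VALUES (?)",
                 [(1,), (2,), (1,), (3,), (4,), (5,)])

# Each SUM(CASE ...) adds 1 only for rows that fall into its bin,
# so the query returns a single row with one column per bin.
cursor = conn.execute("""
    SELECT
        SUM(CASE WHEN grade BETWEEN 1 AND 2 THEN 1 ELSE 0 END) AS bin_1_2,
        SUM(CASE WHEN grade BETWEEN 3 AND 4 THEN 1 ELSE 0 END) AS bin_3_4,
        SUM(CASE WHEN grade BETWEEN 5 AND 6 THEN 1 ELSE 0 END) AS bin_5_6
    FROM grades
""")
labels = [col[0] for col in cursor.description]
counts = cursor.fetchone()

# "Unpivot" on the client side: one (bin, count) pair per row.
histogram = list(zip(labels, counts))
for label, count in histogram:
    print(label, count)
```

Because the bins are spelled out by hand, they can have arbitrary and unequal widths, and an empty bin still shows up with a count of 0.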

Answer 3 (score: 3)

Gamecat's use of DISTINCT seems a little odd to me; I will have to try it out when I'm back in the office...

The way I would do it is similar, though...

SELECT
    [table].grade        AS [grade],
    COUNT(*)             AS [occurrences]
FROM
    [table]
GROUP BY
    [table].grade
ORDER BY
    [table].grade

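To get rows for grades with 0 occurrences, you can LEFT JOIN from a table containing all valid grades. Note that COUNT(*) counts the row the outer join produces even when no grade matched, so a missing grade would report 1, whereas COUNT(grade) skips NULLs and correctly reports 0.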

DECLARE @grades TABLE (
   val INT
   )  

INSERT INTO @grades VALUES (1)  
INSERT INTO @grades VALUES (2)  
INSERT INTO @grades VALUES (3)  
INSERT INTO @grades VALUES (4)  
INSERT INTO @grades VALUES (5)  
INSERT INTO @grades VALUES (6)  

SELECT
    [grades].val         AS [grade],
    COUNT([table].grade) AS [occurrences]
FROM
    @grades   AS [grades]
LEFT JOIN
    [table]
        ON [table].grade = [grades].val
GROUP BY
    [grades].val
ORDER BY
    [grades].val
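
To make the COUNT(*) versus COUNT(column) difference concrete, here is a small sketch that runs the same LEFT JOIN pattern against SQLite from Python, used as a stand-in for SQL Server. The table names `grade_table` and `valid_grades` are placeholders for illustration. The last step joins the counts into the single-line form shown in the question.

```python
import sqlite3

# SQLite in-memory database as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade_table (grade INTEGER)")
conn.executemany("INSERT INTO grade_table (grade) VALUES (?)",
                 [(1,), (2,), (1,), (3,), (4,), (5,)])

# Lookup table holding every valid grade, including the unused grade 6.
conn.execute("CREATE TABLE valid_grades (val INTEGER)")
conn.executemany("INSERT INTO valid_grades (val) VALUES (?)",
                 [(v,) for v in range(1, 7)])

# COUNT(*) counts the NULL-extended row for grade 6;
# COUNT(t.grade) ignores it because t.grade is NULL there.
rows = conn.execute("""
    SELECT v.val, COUNT(*) AS count_star, COUNT(t.grade) AS count_grade
    FROM valid_grades AS v
    LEFT JOIN grade_table AS t ON t.grade = v.val
    GROUP BY v.val
    ORDER BY v.val
""").fetchall()

# Collapse the per-grade counts into one comma-separated line.
histogram_line = ",".join(str(count_grade) for _, _, count_grade in rows)
print(rows)
print(histogram_line)
```

For grade 6 the two counts differ (1 versus 0), which is exactly why the query above counts the joined column rather than using COUNT(*).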

Answer 4 (score: 2)

select Grade, count(Grade)
from MyTable
group by Grade

Answer 5 (score: 2)

According to Shlomo Priymak's article How to Quickly Create a Histogram in MySQL, you can use the following query:

SELECT grade, 
       COUNT(*) AS 'Count',
       RPAD('', COUNT(*), '*') AS 'Bar' 
FROM grades 
GROUP BY grade

Which will produce the following table:

grade   Count   Bar
1       2       **
2       1       *
3       1       *
4       1       *
5       1       *

Answer 6 (score: 0)

I am building on what Ilya Volodin did above; this should let you choose the range of grades you want grouped together in your result:

DECLARE @cnt INT = 0;

WHILE @cnt < 100 -- Set max value
BEGIN
    SELECT @cnt, COUNT(fe) FROM dbo.GEODATA_CB WHERE fe >= @cnt - 0.999 AND fe <= @cnt + 0.999 -- set tolerance
    SET @cnt = @cnt + 1; -- set step
END;