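T-SQL: group consecutive numbers that fall within ranges

Time: 2012-04-06 08:23:05

Tags: sql-server tsql count group-by

Is there a way to group these temperature measurements into consecutive groups? For each group I want the range it belongs to (0-7, 8-12, or more than 12), its time span, and its row count.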

      Date         Heat 
01/01/2012 12:00    8
01/01/2012 12:03    9
01/01/2012 12:06    5
01/01/2012 12:09    3
01/01/2012 12:12    6
01/01/2012 12:15    7
01/01/2012 12:18    1
01/01/2012 12:21    12
01/01/2012 12:24    28
01/01/2012 12:27    25
01/01/2012 12:30    20
01/01/2012 12:33    20
01/01/2012 12:36    20
01/01/2012 12:39    12
01/01/2012 12:42    6
01/01/2012 12:45    3
01/01/2012 12:48    5
01/01/2012 12:51    7
01/01/2012 12:54    11
01/01/2012 12:57    12
01/01/2012 13:00    6

The result should be:

0-7   (01/01/2012 12:06-01/01/2012 12:18)   5
/* Rows of dataset:
01/01/2012 12:06    5
01/01/2012 12:09    3
01/01/2012 12:12    6
01/01/2012 12:15    7
01/01/2012 12:18    1    
*/
0-7   (01/01/2012 12:42-01/01/2012 12:51)   4
/* Rows of dataset:
01/01/2012 12:42    6
01/01/2012 12:45    3
01/01/2012 12:48    5
01/01/2012 12:51    7  
*/
8-12   (01/01/2012 12:00-01/01/2012 12:03)   2
/* Rows of dataset:
01/01/2012 12:00    8
01/01/2012 12:03    9
*/
more than 12   (01/01/2012 12:24-01/01/2012 12:36)   5
/* Rows of dataset:
01/01/2012 12:24    28
01/01/2012 12:27    25
01/01/2012 12:30    20
01/01/2012 12:33    20
01/01/2012 12:36    20
*/

8-12   (01/01/2012 12:21)   1
/* Rows of dataset:
01/01/2012 12:21    12     */

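3 answers:

Answer 0 (score: 4)

Note: because RANK/DENSE_RANK process PARTITION BY first and ORDER BY second, these functions are not useful in this case. Perhaps at some point MS will introduce a complementary syntax, [DENSE_]RANK() OVER(ORDER BY fields PARTITION BY fields), which would process ORDER BY first and PARTITION BY second.

1) First solution (SQL2005+)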

    DECLARE @TestData TABLE
    (
        Dt      SMALLDATETIME PRIMARY KEY,
        Heat    TINYINT NOT NULL
    );

    -- INSERT ... SELECT ... UNION ALL keeps the script compatible with SQL2005
    -- (a VALUES keyword in front of SELECT is a syntax error).
    INSERT  @TestData(Dt, Heat)
                SELECT '2012-01-01T12:00:00', 8  UNION ALL SELECT '2012-01-01T12:03:00', 9  UNION ALL SELECT '2012-01-01T12:06:00', 5
    UNION ALL   SELECT '2012-01-01T12:09:00', 3  UNION ALL SELECT '2012-01-01T12:12:00', 6  UNION ALL SELECT '2012-01-01T12:15:00', 7
    UNION ALL   SELECT '2012-01-01T12:18:00', 1  UNION ALL SELECT '2012-01-01T12:21:00', 12 UNION ALL SELECT '2012-01-01T12:24:00', 28
    UNION ALL   SELECT '2012-01-01T12:27:00', 25 UNION ALL SELECT '2012-01-01T12:30:00', 20 UNION ALL SELECT '2012-01-01T12:33:00', 20
    UNION ALL   SELECT '2012-01-01T12:36:00', 20 UNION ALL SELECT '2012-01-01T12:39:00', 12 UNION ALL SELECT '2012-01-01T12:42:00', 6
    UNION ALL   SELECT '2012-01-01T12:45:00', 3  UNION ALL SELECT '2012-01-01T12:48:00', 5  UNION ALL SELECT '2012-01-01T12:51:00', 7
    UNION ALL   SELECT '2012-01-01T12:54:00', 11 UNION ALL SELECT '2012-01-01T12:57:00', 12 UNION ALL SELECT '2012-01-01T13:00:00', 6;

    SET STATISTICS IO ON;

    WITH CteSource
    AS
    (
            SELECT  a.*,
                    CASE 
                        WHEN a.Heat >= 0 AND a.Heat <= 7 THEN 1
                        WHEN a.Heat >= 8 AND a.Heat <= 12 THEN 2
                        WHEN a.Heat > 12 THEN 3
                    END AS Grp,
                    ROW_NUMBER() OVER(ORDER BY a.Dt) AS RowNum
            FROM    @TestData a
    ),  CteRecursive
    AS
    (
            SELECT  s.RowNum,
                    s.Dt,
                    s.Heat,
                    s.Grp,
                    1 AS DENSE_RANK_OVER_ORDERBY_PARTITIONBY
            FROM    CteSource s
            WHERE   s.RowNum = 1
            UNION ALL
            SELECT  crt.RowNum,
                    crt.Dt,
                    crt.Heat,
                    crt.Grp,
                    CASE 
                        WHEN crt.Grp = prev.Grp THEN prev.DENSE_RANK_OVER_ORDERBY_PARTITIONBY 
                        ELSE prev.DENSE_RANK_OVER_ORDERBY_PARTITIONBY + 1
                    END
            FROM    CteSource crt
            INNER JOIN CteRecursive prev ON crt.RowNum = prev.RowNum + 1
    )
    SELECT  r.DENSE_RANK_OVER_ORDERBY_PARTITIONBY, 
            MAX(r.Grp) AS Grp,
            COUNT(*) AS Cnt,
            MIN(r.Dt) AS MinDt,
            MAX(r.Dt) AS MaxDt
    FROM    CteRecursive r
    GROUP BY r.DENSE_RANK_OVER_ORDERBY_PARTITIONBY;

Result:

DENSE_RANK_OVER_ORDERBY_PARTITIONBY Grp         Cnt         MinDt                   MaxDt
----------------------------------- ----------- ----------- ----------------------- -----------------------
1                                   2           2           2012-01-01 12:00:00     2012-01-01 12:03:00
2                                   1           5           2012-01-01 12:06:00     2012-01-01 12:18:00
3                                   2           1           2012-01-01 12:21:00     2012-01-01 12:21:00
4                                   3           5           2012-01-01 12:24:00     2012-01-01 12:36:00
5                                   2           1           2012-01-01 12:39:00     2012-01-01 12:39:00
6                                   1           4           2012-01-01 12:42:00     2012-01-01 12:51:00
7                                   2           2           2012-01-01 12:54:00     2012-01-01 12:57:00
8                                   1           1           2012-01-01 13:00:00     2012-01-01 13:00:00

2) Second solution (SQL2012; better performance)

SELECT  d.DENSE_RANK_OVER_ORDERBY_PARTITIONBY,
        MAX(d.Grp) AS Grp,
        COUNT(*) AS Cnt,
        MIN(d.Dt) AS MinDt,
        MAX(d.Dt) AS MaxDt
FROM
(
        SELECT  c.*,
                1+SUM(c.IsNewGroup) OVER(ORDER BY c.Dt) AS DENSE_RANK_OVER_ORDERBY_PARTITIONBY
        FROM
        (
                SELECT  b.*,
                        CASE 
                            WHEN LAG(b.Grp) OVER(ORDER BY b.Dt) <> b.Grp THEN 1  
                            ELSE 0
                        END
                        AS IsNewGroup
                FROM    
                (
                        SELECT  a.*,
                                CASE 
                                    WHEN a.Heat >= 0 AND a.Heat <= 7 THEN 1
                                    WHEN a.Heat >= 8 AND a.Heat <= 12 THEN 2
                                    WHEN a.Heat > 12 THEN 3
                                END AS Grp
                        FROM    @TestData a
                ) b
        ) c
) d
GROUP BY d.DENSE_RANK_OVER_ORDERBY_PARTITIONBY;

Answer 1 (score: 1)

Here is an alternative solution for SQL Server 2005 or later:

WITH auxiliary (HeatID, MinHeat, MaxHeat, HeatDescr) AS (
  SELECT 1, 0 , 7   , '0-7'  UNION ALL
  SELECT 2, 8 , 12  , '8-12' UNION ALL
  SELECT 3, 13, NULL, 'more than 12'
),
datagrouped AS (
  SELECT
    d.*,
    a.HeatDescr,
    grp = ROW_NUMBER() OVER (                      ORDER BY d.Date)
        - ROW_NUMBER() OVER (PARTITION BY a.HeatID ORDER BY d.Date)
  FROM data d
    INNER JOIN auxiliary a
      ON d.Heat BETWEEN a.MinHeat AND ISNULL(a.MaxHeat, 0x7fffffff)
)
SELECT
  HeatDescr,
  DateFrom  = MIN(Date),
  DateTo    = MAX(Date),
  ItemCount = COUNT(*)
FROM datagrouped
GROUP BY
  HeatDescr, grp
ORDER BY
  MIN(Date)

where data is defined as follows:

CREATE TABLE data (Date datetime, Heat int);

INSERT INTO data (Date, Heat)
SELECT '01/01/2012 12:00',  8  UNION ALL
SELECT '01/01/2012 12:03',  9  UNION ALL
SELECT '01/01/2012 12:06',  5  UNION ALL
SELECT '01/01/2012 12:09',  3  UNION ALL
SELECT '01/01/2012 12:12',  6  UNION ALL
SELECT '01/01/2012 12:15',  7  UNION ALL
SELECT '01/01/2012 12:18',  1  UNION ALL
SELECT '01/01/2012 12:21',  12 UNION ALL
SELECT '01/01/2012 12:24',  28 UNION ALL
SELECT '01/01/2012 12:27',  25 UNION ALL
SELECT '01/01/2012 12:30',  20 UNION ALL
SELECT '01/01/2012 12:33',  20 UNION ALL
SELECT '01/01/2012 12:36',  20 UNION ALL
SELECT '01/01/2012 12:39',  12 UNION ALL
SELECT '01/01/2012 12:42',  6  UNION ALL
SELECT '01/01/2012 12:45',  3  UNION ALL
SELECT '01/01/2012 12:48',  5  UNION ALL
SELECT '01/01/2012 12:51',  7  UNION ALL
SELECT '01/01/2012 12:54',  11 UNION ALL
SELECT '01/01/2012 12:57',  12 UNION ALL
SELECT '01/01/2012 13:00',  6;

For the sample above, the query produces the following output:

HeatDescr     DateFrom             DateTo               ItemCount 
------------  -------------------  -------------------  --------- 
8-12          2012-01-01 12:00:00  2012-01-01 12:03:00  2    
0-7           2012-01-01 12:06:00  2012-01-01 12:18:00  5    
8-12          2012-01-01 12:21:00  2012-01-01 12:21:00  1    
more than 12  2012-01-01 12:24:00  2012-01-01 12:36:00  5    
8-12          2012-01-01 12:39:00  2012-01-01 12:39:00  1    
0-7           2012-01-01 12:42:00  2012-01-01 12:51:00  4    
8-12          2012-01-01 12:54:00  2012-01-01 12:57:00  2    
0-7           2012-01-01 13:00:00  2012-01-01 13:00:00  1    
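To see why subtracting the two ROW_NUMBER() values isolates each consecutive run, it may help to trace the arithmetic by hand. The Python sketch below is only an illustration under stated assumptions, not the answer's own code: `BANDS`, `band_label`, and `islands` are hypothetical names, and the timestamps are shortened to HH:MM.

```python
from collections import defaultdict

# Sample readings from the question: (time, heat), already ordered by time.
READINGS = [
    ("12:00", 8), ("12:03", 9), ("12:06", 5), ("12:09", 3), ("12:12", 6),
    ("12:15", 7), ("12:18", 1), ("12:21", 12), ("12:24", 28), ("12:27", 25),
    ("12:30", 20), ("12:33", 20), ("12:36", 20), ("12:39", 12), ("12:42", 6),
    ("12:45", 3), ("12:48", 5), ("12:51", 7), ("12:54", 11), ("12:57", 12),
    ("13:00", 6),
]

# Mirrors the auxiliary CTE: (MinHeat, MaxHeat, HeatDescr); None means no upper bound.
BANDS = [(0, 7, "0-7"), (8, 12, "8-12"), (13, None, "more than 12")]


def band_label(heat):
    """Return the band description for a heat value, or None if no band matches."""
    for low, high, label in BANDS:
        if heat >= low and (high is None or heat <= high):
            return label
    return None


def islands(readings):
    """Return (label, first_time, last_time, count) for each consecutive run."""
    labelled = [(time, band_label(heat)) for time, heat in readings]
    # Like the INNER JOIN, rows that match no band are dropped before numbering.
    labelled = [(time, label) for time, label in labelled if label is not None]
    rows_per_band = defaultdict(int)
    groups = {}
    for overall_rn, (time, label) in enumerate(labelled, start=1):
        rows_per_band[label] += 1                  # row number within the band
        grp = overall_rn - rows_per_band[label]    # constant while the band repeats
        key = (label, grp)
        if key not in groups:
            groups[key] = [label, time, time, 0]
        groups[key][2] = time                      # last time seen so far
        groups[key][3] += 1
    # Dict insertion order already follows the first timestamp of each run.
    return [tuple(group) for group in groups.values()]


if __name__ == "__main__":
    for island in islands(READINGS):
        print(island)
```

Within one run both counters advance by one per row, so their difference stays fixed; when another band interrupts, only the overall counter advances and the difference jumps. The difference alone can coincide between different bands, which is presumably why the query groups by HeatDescr together with grp.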

Answer 2 (score: 0)

You should use RANK() to achieve this:

http://msdn.microsoft.com/en-us/library/ms176102.aspx

Something like this:
SELECT date, heat, RANK() OVER (PARTITION BY heat ORDER BY date DESC) AS Rank
FROM tbl

You can then group on it afterwards, or add further subselects and unions depending on your results.