SQL query - find the daily MIN value from hourly sums

Asked: 2015-03-26 16:48:19

Tags: sql sql-server tsql

Let me cut to the chase. I have a table that looks like this (using SQL Server 2014):

Sample: http://sqlfiddle.com/#!6/75f4a/1/0

CREATE TABLE TAB (
    DT datetime,
    VALUE float
);

INSERT INTO TAB VALUES
('2015-05-01 06:00:00', 12),
('2015-05-01 06:20:00', 10),
('2015-05-01 06:40:00', 11),
('2015-05-01 07:00:00', 14),
('2015-05-01 07:20:00', 15),
('2015-05-01 07:40:00', 13),
('2015-05-01 08:00:00', 10),
('2015-05-01 08:20:00', 9),
('2015-05-01 08:40:00', 5),

('2015-05-02 06:00:00', 19),
('2015-05-02 06:20:00', 7),
('2015-05-02 06:40:00', 11),
('2015-05-02 07:00:00', 9),
('2015-05-02 07:20:00', 7),
('2015-05-02 07:40:00', 6),
('2015-05-02 08:00:00', 10),
('2015-05-02 08:20:00', 19),
('2015-05-02 08:40:00', 15),

('2015-05-03 06:00:00', 8),
('2015-05-03 06:20:00', 8),
('2015-05-03 06:40:00', 8),
('2015-05-03 07:00:00', 21),
('2015-05-03 07:20:00', 12),
('2015-05-03 07:40:00', 7),
('2015-05-03 08:00:00', 10),
('2015-05-03 08:20:00', 4),
('2015-05-03 08:40:00', 10)

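I need to:

  • sum the values per hour
  • select the smallest "hourly sum" for each day
  • select the hour in which that sum occurred

In other words, I want a table that looks like this: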

DATE       | SUM VAL | ON HOUR
--------------------------------
2015-05-01 | 24      | 8:00
2015-05-02 | 22      | 7:00
2015-05-03 | 24      | 6:00

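The first two points are easy (see the sqlfiddle). The third one is my problem. I can't simply select Datepart(HOUR, DT), because it would have to be aggregated. I tried using JOINs and WHERE clauses, but without success (some values can occur in the table more than once, which raises an error).

I'm quite new to SQL and I'm stuck. I need your help! :)

5 answers:

Answer 0 (score: 2)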

DECLARE @TAB TABLE
    (
      DT DATETIME ,
      VALUE FLOAT
    );

INSERT  INTO @TAB
VALUES  ( '2015-05-01 06:00:00', 12 ),
        ( '2015-05-01 06:20:00', 10 ),
        ( '2015-05-01 06:40:00', 11 ),
        ( '2015-05-01 07:00:00', 14 ),
        ( '2015-05-01 07:20:00', 15 ),
        ( '2015-05-01 07:40:00', 13 ),
        ( '2015-05-01 08:00:00', 10 ),
        ( '2015-05-01 08:20:00', 9 ),
        ( '2015-05-01 08:40:00', 5 ),
        ( '2015-05-02 06:00:00', 19 ),
        ( '2015-05-02 06:20:00', 7 ),
        ( '2015-05-02 06:40:00', 11 ),
        ( '2015-05-02 07:00:00', 9 ),
        ( '2015-05-02 07:20:00', 7 ),
        ( '2015-05-02 07:40:00', 6 ),
        ( '2015-05-02 08:00:00', 10 ),
        ( '2015-05-02 08:20:00', 19 ),
        ( '2015-05-02 08:40:00', 15 ),
        ( '2015-05-03 06:00:00', 8 ),
        ( '2015-05-03 06:20:00', 8 ),
        ( '2015-05-03 06:40:00', 8 ),
        ( '2015-05-03 07:00:00', 21 ),
        ( '2015-05-03 07:20:00', 12 ),
        ( '2015-05-03 07:40:00', 7 ),
        ( '2015-05-03 08:00:00', 10 ),
        ( '2015-05-03 08:20:00', 4 ),
        ( '2015-05-03 08:40:00', 10 );
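-- cteh: attach to every row the sum of its (date, hour) group
WITH    cteh
          AS ( SELECT   DT ,
                        CAST(dt AS DATE) AS D ,
                        SUM(VALUE) OVER ( PARTITION BY CAST(dt AS DATE),
                                          DATEPART(hh, DT) ) AS S
               FROM     @TAB
             ),
-- ctef: number the rows of each day by hourly sum; DT breaks ties so the
-- earliest row of the earliest minimal hour is picked deterministically
        ctef
          AS ( SELECT   * ,
                        ROW_NUMBER() OVER ( PARTITION BY D ORDER BY S, DT ) AS rn
               FROM     cteh
             )
    SELECT  D ,
            S ,
            CAST(DT AS TIME) AS H
    FROM    ctef
    WHERE   rn = 1;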

Output:

D           S   H
2015-05-01  24  08:00:00.0000000
2015-05-02  22  07:00:00.0000000
2015-05-03  24  06:00:00.0000000

Answer 1 (score: 1)

One approach is to use the set of daily minimum hourly sums as a derived table and join to it. I would do something like this:

;WITH CTE AS (
    SELECT Cast(Format(DT, 'yyyy-MM-dd HH:00') AS datetime) AS DT, SUM(VALUE) AS VAL
    FROM TAB
    GROUP BY Format(DT, 'yyyy-MM-dd HH:00')
) 

SELECT b.dt "Date", val "sum val", cast(min(a.dt) as time) "on hour"
FROM cte a JOIN (
    SELECT Format(DT,'yyyy-MM-dd') AS DT, MIN(VAL) AS DAILY_MIN 
    FROM cte HOURLY
    GROUP BY Format(DT,'yyyy-MM-dd')
) b ON CAST(a.DT AS DATE) = b.DT and a.VAL = b.DAILY_MIN
GROUP BY b.DT, a.VAL

This yields:

Date        sum val on hour
2015-05-01  24      08:00:00.0000000
2015-05-02  22      07:00:00.0000000
2015-05-03  24      06:00:00.0000000

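I used min() on the time part because your sample data has the same lowest sum for two separate hours on the 3rd. If you want both rows, remove the min function and the GROUP BY from the outer select. Then you get: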

Date        sum val on hour
2015-05-01  24      08:00:00.0000000
2015-05-02  22      07:00:00.0000000
2015-05-03  24      06:00:00.0000000
2015-05-03  24      08:00:00.0000000
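
For reference, the variant without min() might look like the sketch below. It assumes the same TAB table as in the question; the final ORDER BY is an addition made here only to keep the row order stable.

```sql
-- Hourly sums, with each timestamp truncated to the start of its hour.
WITH CTE AS (
    SELECT CAST(FORMAT(DT, 'yyyy-MM-dd HH:00') AS datetime) AS DT,
           SUM(VALUE) AS VAL
    FROM TAB
    GROUP BY FORMAT(DT, 'yyyy-MM-dd HH:00')
)
-- No MIN() and no outer GROUP BY: every hour whose sum equals the
-- daily minimum is returned, so ties produce more than one row per day.
SELECT b.DT AS "Date", a.VAL AS "sum val", CAST(a.DT AS time) AS "on hour"
FROM CTE a
JOIN (
    SELECT FORMAT(DT, 'yyyy-MM-dd') AS DT, MIN(VAL) AS DAILY_MIN
    FROM CTE
    GROUP BY FORMAT(DT, 'yyyy-MM-dd')
) b ON CAST(a.DT AS date) = b.DT AND a.VAL = b.DAILY_MIN
ORDER BY b.DT, a.DT;  -- ordering added for readability (assumption)
```

Because the CTE already holds exactly one row per hour, dropping the outer aggregation does not create duplicates; it only stops collapsing tied hours.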

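I'm sure it can be improved, but you should get the idea.

Answer 2 (score: 1)

Here is one approach that uses a temporary table (as opposed to the CTEs in the other solutions) to store the computed values and then filters the result to get the desired output: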

-- INSERT CALCULATED GROUPED VALUES INTO TEMP TABLE
SELECT  CONVERT(DATE, DT) AS DateVal ,
        SUM(VALUE) AS SumVal ,
        DATEPART(HOUR, CONVERT(TIME, DT)) AS HourVal
INTO    #TEMP_CALC
FROM    TAB
GROUP BY CONVERT(DATE, DT) , DATEPART(HOUR, CONVERT(TIME, DT))

-- TAKE THE RELEVANT ROWS
SELECT  t.DateVal ,
        MIN(t.SumVal) AS SumVal ,
        ( SELECT TOP 1
                    HourVal
          FROM      #TEMP_CALC t2
          WHERE     t2.DateVal = t.DateVal
                    AND t2.SumVal = MIN(t.SumVal)
          ORDER BY  t2.HourVal -- earliest hour wins when sums tie
        ) AS MinHour
FROM    #TEMP_CALC t
GROUP BY t.DateVal
ORDER BY DateVal

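Answer 3 (score: 0)

You can use DATEDIFF to get the time span, in hours and in days, from an arbitrary starting point (1990-1-1 in this example). The spans are used for grouping and ordering, and finally DATEADD with the same starting point rebuilds the date and hour: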

WITH dates AS (
  SELECT CAST(DT AS DATETIME) AS Date, -- make sure the value is a datetime
  value FROM dbo.TAB AS T
),
ddh AS (SELECT 
    date,
    DATEDIFF(DAY, '1990-1-1', date) AS daySpan,    -- days span
    DATEDIFF(HOUR, '1990-1-1', date) AS hourSpan,  -- hours span
    value
    FROM dates
),
ddhv AS ( SELECT
    daySpan,
    hourSpan,
    SUM(value) AS sumValues    -- sum...
    FROM ddh
    group BY daySpan, hourSpan -- ...grouped by day & hour
),
ddhvr AS ( SELECT
    daySpan,
    hourSpan,
    sumValues,
    -- number rows by hourly sum of the value
    -- hourSpan breaks ties, so the earliest hour is numbered first
    ROW_NUMBER() OVER (PARTITION BY daySpan ORDER BY sumValues, hourSpan) AS row
FROM ddhv
)
SELECT
    DATEADD(HOUR, hourSpan, '1990-1-1') AS DayHour, -- rebuild the date/hour
    sumValues
FROM ddhvr
WHERE row = 1 -- take only the first occurrence for each day

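The advantage of this query is that you can easily change the period and the starting point. For example, you can start your day at 6:30 AM instead of 00:00, so that the compared periods are 6:30 to 7:30, 7:30 to 8:30, and so on. You can also change the grouping unit: instead of 1 hour it could be half an hour, 5 minutes, or 2 hours. If you need to do this, see this SO answer. There you will see how to group by different periods and get back to the starting point of each period. It is just some simple math.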
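
As a rough illustration of that math, here is a minimal sketch assuming the same TAB table. The variable names @origin and @periodMinutes and the CTE name bucketed are hypothetical, introduced here only for the example; they do not come from the linked answer.

```sql
-- Hypothetical parameters: where periods start and how long each one is.
DECLARE @origin datetime = '1990-01-01T06:30:00';
DECLARE @periodMinutes int = 60;

WITH bucketed AS (
    SELECT  -- Integer division maps each row to the number of whole periods
            -- elapsed since @origin (assumes DT is not earlier than @origin).
            DATEDIFF(MINUTE, @origin, DT) / @periodMinutes AS periodNo,
            VALUE
    FROM    TAB
)
SELECT  -- Multiply back and add to @origin to recover the period start.
        DATEADD(MINUTE, periodNo * @periodMinutes, @origin) AS PeriodStart,
        SUM(VALUE) AS SumVal
FROM    bucketed
GROUP BY periodNo
ORDER BY PeriodStart;
```

With these values the periods run 6:30 to 7:30, 7:30 to 8:30, and so on. Setting @periodMinutes to 30, 5, or 120 changes the grouping unit without touching the rest of the query.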

Answer 4 (score: 0)

I tested this in my fiddle:

with agg as (
    select cast(dt as date) as dt, datepart(hh, dt) as hr, sum(VALUE) as sum_val
    from TAB
    group by cast(dt as date), datepart(hh, dt)
)
select
    dt, min(sum_val) as "SUM VAL",
    (
        select cast(hr as varchar(2)) + ':00' from agg as agg2
        where agg2.dt = agg.dt and not exists (
            /* no other hour that day has a smaller sum,
               or the same sum at an earlier hour (earliest wins ties) */
            select 1 from agg as agg3
            where agg3.dt = agg2.dt
              and (agg3.sum_val < agg2.sum_val
                   or (agg3.sum_val = agg2.sum_val and agg3.hr < agg2.hr))
        )
    ) as "ON HOUR"
from agg
group by dt;