Getting data over 1 hour using a self join in T-SQL

Time: 2017-11-08 18:49:04

Tags: sql sql-server tsql self-join

I have a table like this (claimsTable):

 Time              Terminal_ID  Claims data_from
------------------------------------------------

2017-10-19 06:03:00     1        561     2
2017-10-19 06:04:00     1        562     2 
2017-10-19 06:05:00     1        562.3   2
2017-10-19 06:06:00     1        563     2
2017-10-19 06:03:00     9        471     2
2017-10-19 06:04:00     9        471.9   2
2017-10-19 06:05:00     9        472.3   2
2017-10-19 06:06:00     9        473     2
2017-10-19 06:07:00     1        567     1
2017-10-19 06:08:00     1        567.6   1 
2017-10-19 06:09:00     1        568.2   1
2017-10-19 06:10:00     1        569     1
2017-10-19 06:07:00     9        475     1
2017-10-19 06:08:00     9        475.9   1
2017-10-19 06:09:00     9        476.3   1
2017-10-19 06:10:00     9        476.3   1

There are several days of data for each ID; I have shown only some of it above. Now, I check the oldest data for each Terminal_ID from data_from = 1:

select min(Time), Terminal_ID
from claimsTable 
where data_from = 1 
group by Terminal_ID

I get 2017-10-19 06:07:00 for each ID.

Next, I check the latest data for each Terminal_ID from data_from = 2:

select max(Time), Terminal_ID 
from claimsTable 
where data_from = 2
group by Terminal_ID

Now I get 2017-10-19 06:06:00 for each Terminal_ID.
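The two lookups above can also be done in a single pass with conditional aggregation. This is only a minimal sketch, assuming the column names shown in the question; the table variable `@claims` and the output names `MaxTime2` and `MinTime1` are hypothetical stand-ins so that the snippet runs by itself:

```sql
-- Hypothetical stand-in for claimsTable, loaded with a few rows from the question
DECLARE @claims TABLE ([Time] DATETIME, Terminal_ID INT, Claims FLOAT, data_from INT);

INSERT INTO @claims ([Time], Terminal_ID, Claims, data_from)
VALUES
    ('2017-10-19T06:05:00', 1, 562.3, 2),
    ('2017-10-19T06:06:00', 1, 563,   2),
    ('2017-10-19T06:07:00', 1, 567,   1),
    ('2017-10-19T06:08:00', 1, 567.6, 1),
    ('2017-10-19T06:05:00', 9, 472.3, 2),
    ('2017-10-19T06:06:00', 9, 473,   2),
    ('2017-10-19T06:07:00', 9, 475,   1),
    ('2017-10-19T06:08:00', 9, 475.9, 1);

-- One row per terminal: latest data_from = 2 time and oldest data_from = 1 time.
-- CASE without ELSE yields NULL for the other source, and MAX/MIN ignore NULLs.
SELECT
    Terminal_ID,
    MAX(CASE WHEN data_from = 2 THEN [Time] END) AS MaxTime2,
    MIN(CASE WHEN data_from = 1 THEN [Time] END) AS MinTime1
FROM @claims
GROUP BY Terminal_ID;
```

For these rows both terminals come back with MaxTime2 = 06:06:00 and MinTime1 = 06:07:00, matching the two separate queries.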

Now I want to get 60 minutes of data counting back from the latest time (max(Time) where data_from = 2), to calculate the hourly average leading up to the oldest data from data_from = 1.

So, I checked using a self join like this:

select t1.[Time], t1.Terminal_ID
from claimsTable t1
inner join claimsTable t2
    on t1.Terminal_ID = t2.Terminal_ID
where t1.Terminal_ID = t2.Terminal_ID
  and t1.[Time] between dateadd(mi,-59,t2.[Time]) and t1.[Time]

This doesn't give me the required check, because I have not used the max and min functions on t1 and t2. I am not sure how to include them when I self join.

My expected output table:

 Time              Terminal_ID  Claims data_from
------------------------------------------------
2017-10-19 06:03:00     1        561     2
2017-10-19 06:04:00     1        562     2
2017-10-19 06:05:00     1        562.3   2
2017-10-19 06:06:00     1        563     2
2017-10-19 06:07:00     1        567     1
2017-10-19 06:03:00     9        471     2
2017-10-19 06:04:00     9        471.9   2
2017-10-19 06:05:00     9        472.3   2
2017-10-19 06:06:00     9        473     2
2017-10-19 06:07:00     9        475     1

How do I get the previous 60 minutes of data in data_from = 2, counting back from the oldest data found in data_from = 1?

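2 Answers:

Answer 0: (score: 1)

I tried to calculate the Avg of all records between mintime (data_from = 1) and (Max - 60 mins AND data_from == 2).

The problem is that I get NULL in both cases, possibly because something is missing from the query, or because there is not enough data.

Try running the query and let me know if there are any problems.

Here is the query that creates the sample data: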

    -- CREATE SAMPLE DATA
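    -- drop the temp table only if it already exists, so the first run does not fail
    IF OBJECT_ID('tempdb..#claimsTable') IS NOT NULL DROP TABLE #claimsTable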

    CREATE TABLE #claimsTable
    (
        [Time] DateTime,
        Terminal_ID INT,
        Claims FLOAT,
        data_from INT

    )

    INSERT INTO #claimsTable
    VALUES

        (N'2017-10-19 06:03:00',     1 ,       561  ,   2),
        (N'2017-10-19 06:04:00',     1 ,       562  ,   2), 
        (N'2017-10-19 06:05:00',     1 ,       562.3,   2),
        (N'2017-10-19 06:06:00',     1 ,       563  ,   2),
        (N'2017-10-19 06:03:00',     9 ,       471  ,   2),
        (N'2017-10-19 06:04:00',     9 ,       471.9,   2),
        (N'2017-10-19 06:05:00',     9 ,       472.3,   2),
        (N'2017-10-19 06:09:00',     9 ,       473  ,   2),
        (N'2017-10-19 06:07:00',     1 ,       567  ,   1),
        (N'2017-10-19 06:08:00',     1 ,       567.6,   1), 
        (N'2017-10-19 06:09:00',     1 ,       568.2,   1),
        (N'2017-10-19 06:10:00',     1 ,       569  ,   1),
        (N'2017-10-19 06:05:00',     9 ,       475  ,   1),
        (N'2017-10-19 06:08:00',     9 ,       475.9,   1),
        (N'2017-10-19 06:09:00',     9 ,       476.3,   1)

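I changed one or two of the times in the sample data so that the results produced by my query are easier to follow.

The actual query starts here: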

    Select
        A.TerminalId,
        Avrg = AVG(data_between.Claims)
    From
        (
        -- this inner query returns
        /*
            TerminalId  |           MaxTime (data_from == 2)    |   Min Time (data_from == 1)
            -------------------------------------------------------------------------------------
            9           |   2017-10-19 06:09:00.000             |   2017-10-19 06:05:00.000
            1           |   2017-10-19 06:06:00.000             |   2017-10-19 06:07:00.000
        */

            Select
                TerminalId = data_from_2.Terminal_ID,
                MaxTime2 = MAX(data_from_2.[Time]),
                MinTime1 = data_from_1.[Time]

            From
                #claimsTable data_from_2
                -- This will get MIN the data_from = 1 for each terminal_id
                CROSS APPLY (
                    SELECT TOP (1) 
                        * 
                    FROM #claimsTable  a

                    WHERE a.data_from = 1 AND a.Terminal_ID = data_from_2.Terminal_ID

                    ORDER BY a.[Time] ASC

                ) data_from_1
            --
            Where data_from_2.data_from = 2
            -- group by to get the Max.Time for each terminal
            GROUP BY data_from_2.Terminal_ID, data_from_1.[Time]
        ) A

        -- join with claimsTable again to get the data between mintime(data_from = 1 ) and (Max - 60 mins) so we can calculate avg

        LEFT JOIN #claimsTable data_between on data_between.Terminal_ID = A.TerminalId AND data_between.[Time] BETWEEN A.MinTime1 AND DATEADD(MINUTE, -60, A.MaxTime2)
        --
        GROUP BY A.TerminalId
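
On why the query above returns NULL for this sample: for both terminals the lower bound MinTime1 (06:07 for terminal 1, 06:05 for terminal 9) is later than the upper bound DATEADD(MINUTE, -60, A.MaxTime2) (05:06 and 05:09). BETWEEN matches nothing when its lower bound exceeds its upper bound, so the LEFT JOIN finds no rows and AVG yields NULL. The following is a minimal sketch of one possible reading of the question, assuming the window should open 60 minutes before the latest data_from = 2 row and close at the oldest data_from = 1 row. The alias names `bounds` and `c` are illustrative, and the sketch expects #claimsTable from the sample-data script to exist in the session:

```sql
-- Assumes #claimsTable from the sample-data script already exists in this session
SELECT
    bounds.Terminal_ID,
    AVG(c.Claims) AS Avrg
FROM
    (
        -- one row per terminal with both boundary times
        SELECT
            Terminal_ID,
            MAX(CASE WHEN data_from = 2 THEN [Time] END) AS MaxTime2,
            MIN(CASE WHEN data_from = 1 THEN [Time] END) AS MinTime1
        FROM #claimsTable
        GROUP BY Terminal_ID
    ) AS bounds
    LEFT JOIN #claimsTable AS c
        ON  c.Terminal_ID = bounds.Terminal_ID
        AND c.[Time] >= DATEADD(MINUTE, -60, bounds.MaxTime2)  -- window opens 60 minutes before the latest data_from = 2 row
        AND c.[Time] <= bounds.MinTime1                        -- and closes at the oldest data_from = 1 row
GROUP BY bounds.Terminal_ID;
```

With the sample rows, both terminals have records inside that window, so the average is no longer NULL. Whether this is the intended window depends on how the question's 60-minute requirement is read.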

Answer 1: (score: 0)

Just use the max function on the time column of the self join.

select 
      max(t2.[Time]), 
      t1.Terminal_ID
from 
    claimsTable t1
full join 
    claimsTable t2
    on t1.Terminal_ID = t2.Terminal_ID
    and t2.[Time] <= dateadd(mi,-59,t1.[Time]) 
group by
    t1.Terminal_ID
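
One way to apply this idea while still returning the detail rows, as in the question's expected output, is to aggregate first in a derived table and then join that back to the base table. The following is a sketch under assumptions, not a tested solution for the real data: the table variable `@claims` is a hypothetical stand-in for claimsTable, the 05:00 row with Claims = 540 is invented to show a row falling outside the window, and the window is assumed to run from 60 minutes before the latest data_from = 2 time up to the oldest data_from = 1 time.

```sql
-- Hypothetical stand-in for claimsTable; the 05:00 row is invented and lies outside the 60-minute window
DECLARE @claims TABLE ([Time] DATETIME, Terminal_ID INT, Claims FLOAT, data_from INT);

INSERT INTO @claims ([Time], Terminal_ID, Claims, data_from)
VALUES
    ('2017-10-19T05:00:00', 1, 540,   2),
    ('2017-10-19T06:05:00', 1, 562.3, 2),
    ('2017-10-19T06:06:00', 1, 563,   2),
    ('2017-10-19T06:07:00', 1, 567,   1),
    ('2017-10-19T06:08:00', 1, 567.6, 1);

SELECT t1.[Time], t1.Terminal_ID, t1.Claims, t1.data_from
FROM @claims AS t1
INNER JOIN (
    -- Aggregate first, so MAX and MIN are available to the outer filter
    SELECT
        Terminal_ID,
        MAX(CASE WHEN data_from = 2 THEN [Time] END) AS MaxTime2,
        MIN(CASE WHEN data_from = 1 THEN [Time] END) AS MinTime1
    FROM @claims
    GROUP BY Terminal_ID
) AS t2
    ON t1.Terminal_ID = t2.Terminal_ID
WHERE t1.[Time] >= DATEADD(MINUTE, -60, t2.MaxTime2)  -- no earlier than 60 minutes before the latest data_from = 2 row
  AND t1.[Time] <= t2.MinTime1                        -- no later than the oldest data_from = 1 row
ORDER BY t1.Terminal_ID, t1.[Time];
```

For these rows the query returns the 06:05, 06:06 and 06:07 records and leaves out both the 05:00 row and the 06:08 row.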