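How to determine duration

Time: 2014-06-11 16:50:51

Tags: sql

Note: this is advanced SQL ... (IMHO)

Summary of what I am trying to do:

I have an application that writes timestamped data to a database. What I want to do is look at that data in chronological order and determine the duration between a "start" point and a "stop" point.

This is the UserRoadData table: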

    CREATE TABLE [dbo].[UserRoadData](
        [Indx] [uniqueidentifier] NOT NULL,
        [Indx_User] [uniqueidentifier] NOT NULL,
        [Indx_RoadPoint] [uniqueidentifier] NOT NULL,
        [TimeHit] [datetime] NOT NULL,
        [Indx_UserRoadDataStatus] [int] NOT NULL,
     CONSTRAINT [PK_UserRoadData] PRIMARY KEY CLUSTERED 

This is the RoadPoints table:

    CREATE TABLE [dbo].[RoadPoints](
        [Indx] [uniqueidentifier] NOT NULL,
        [Indx_Road] [uniqueidentifier] NOT NULL,
        [Indx_PointType] [int] NOT NULL,
        [Sequence] [int] NOT NULL,
        [SegmentNumber] [int] NOT NULL,
        [GeoLoc] [geography] NOT NULL,
     CONSTRAINT [PK_RoadPoints] PRIMARY KEY CLUSTERED 

I have the following query, which attempts to determine the durations (note: at this point @NotProcessed = 0):

        declare @Now datetime
        set @Now = GETDATE();

        with CTE_1 as( 
        SELECT
                Indx_RoadPoint
                ,Indx_Road
                ,SegmentNumber
                ,Indx_UserRoadDataStatus
                ,case WHEN Indx_PointType = @SegmentStart THEN TimeHit
                    ELSE NULL
                END AS InTime
                ,case WHEN Indx_PointType = @SegmentEnd THEN TimeHit
                    ELSE NULL
                END AS OutTime
                ,row_number() OVER ( ORDER BY TimeHit ASC ) AS rowNum
                FROM
                UserRoadData INNER JOIN RoadPoints ON UserRoadData.Indx_RoadPoint = RoadPoints.Indx 
                where 
                Indx_User = @Indx_User
                and
                Indx_Road = @Indx_Road
                and
                Indx_UserRoadDataStatus = @NotProcessed
                ),

          CTE_2 -- mark those that repeat
            AS ( SELECT
                t.Indx_RoadPoint
                ,t.Indx_Road
                ,t.SegmentNumber
                ,t.InTime
                ,t.OutTime
                ,t.rowNum
                ,case WHEN ( SELECT Indx_RoadPoint
                              FROM CTE_1 AS x
                              WHERE x.rowNum = t.rowNum - 1
                            ) = t.Indx_RoadPoint THEN 1
                       ELSE 0
                  END AS mrk
                 FROM CTE_1 AS t
               ),

          CTE_3 --extract non repeats and group
            AS ( SELECT
                  *
                 ,row_number() OVER ( PARTITION BY Indx_RoadPoint ORDER BY rowNum ASC ) AS rn2
                 FROM CTE_2
                 WHERE  mrk = 0
               )
      SELECT
        newid() as indx  -- this table needs to have the same columns as UserRideData
        ,t1.Indx_Road
        ,@Indx_User as Indx_User
        ,t1.SegmentNumber
        ,t1.InTime
        ,t2.OutTime
        ,datediff(ss, t1.InTime, t2.OutTime) AS Duration
        ,@Now as DateRecorded

      INTO #RoadAndRoadData  -- results are put into a temp table...

      FROM
        CTE_3 AS t1
        JOIN CTE_3 AS t2 ON t1.rn2 = t2.rn2
      WHERE
        t1.Intime IS NOT NULL
        AND t2.OutTime IS NOT NULL

      GROUP BY
        t1.Indx_Road
        ,t1.SegmentNumber
        ,t1.InTime
        ,t2.OutTime


      -- insert the temp table data into the real table
      INSERT INTO dbo.UserRideData SELECT * FROM #RoadAndRoadData

Here is the entire output of the CTE_1 statement, and it contains the correct data! The GUIDs (Indx) have been shortened to protect the innocent :-)

Indx    Indx_Road   Segment Status  InTime  OutTime rowNum

87502C53    28992B99    0   0   NULL    NULL    1
17BB6691    28992B99    0   0   NULL    NULL    2
C40FD10F    28992B99    1   0   2014-06-11 09:09:11.200 NULL    3
7BC5D0A6    28992B99    1   0   NULL    NULL    4
97BA8F20    28992B99    1   0   NULL    NULL    5
75A3F916    28992B99    1   0   NULL    NULL    6
FA2B73E5    28992B99    1   0   NULL    NULL    7
D1E16249    28992B99    1   0   NULL    NULL    8
BAB45A3C    28992B99    1   0   NULL    NULL    9
0EC3D9AD    28992B99    1   0   NULL    NULL    10
3A0BAF2A    28992B99    1   0   NULL    NULL    11
5B97F78A    28992B99    1   0   NULL    2014-06-11 09:09:20.200 12
E55C20C5    28992B99    2   0   2014-06-11 09:09:21.200 NULL    13
FBC14E4E    28992B99    2   0   NULL    NULL    14
5396D1FF    28992B99    2   0   NULL    NULL    15
63D5F64B    28992B99    2   0   NULL    NULL    16
A463F4FA    28992B99    2   0   NULL    2014-06-11 09:09:25.200 17
F6A528D8    28992B99    0   0   NULL    NULL    18
1D73335D    28992B99    0   0   NULL    NULL    19

As you can see, each unique segment has one start time followed by one stop time, for example:

 Segment 1 indx C40FD10F has a non null start time
 Segment 1 indx 5B97F78A has a non null stop time  --- ( PAIR 1 )
 Segment 2 indx E55C20C5 has a non null start time
 Segment 2 indx A463F4FA has a non null stop time  --- ( PAIR 2 )

Here is the data that really matters from the output above. The question comes down to getting only the durations for PAIR 1 and PAIR 2 into a table:

Indx    Indx_Road   Segment Status  InTime  OutTime rowNum

C40FD10F    28992B99    1   0   2014-06-11 09:09:11.200 NULL    3
5B97F78A    28992B99    1   0   NULL    2014-06-11 09:09:20.200 12   --- ( PAIR 1 )

E55C20C5    28992B99    2   0   2014-06-11 09:09:21.200 NULL    13
A463F4FA    28992B99    2   0   NULL    2014-06-11 09:09:25.200 17   --- ( PAIR 2 )

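Here are the results when the stored proc runs. Note that segment 1 and segment 2 each have two durations, and in each case one of them is a false positive: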

indx    Indx_Road   Indx_User   SegmentNumber   InTime  OutTime Duration    DateRecorded
382A9F0D    28992B99    22222222    1   2014-06-11 09:09:11.200 2014-06-11 09:09:20.200 9   2014-06-11 09:09:28.207
BC942182    28992B99    22222222    1   2014-06-11 09:09:11.200 2014-06-11 09:09:25.200 14  2014-06-11 09:09:28.207
548A0340    28992B99    22222222    2   2014-06-11 09:09:21.200 2014-06-11 09:09:20.200 -1  2014-06-11 09:09:28.207
E8322022    28992B99    22222222    2   2014-06-11 09:09:21.200 2014-06-11 09:09:25.200 4   2014-06-11 09:09:28.207

I want only one duration per segment, like this (from PAIR 1 and PAIR 2 above):

indx    Indx_Road   Indx_User   SegmentNumber   InTime  OutTime Duration    DateRecorded
382A9F0D    28992B99    22222222    1   2014-06-11 09:09:11.200 2014-06-11 09:09:20.200 9   2014-06-11 09:09:28.207
E8322022    28992B99    22222222    2   2014-06-11 09:09:21.200 2014-06-11 09:09:25.200 4   2014-06-11 09:09:28.207

Any insight you can provide would be greatly appreciated!

...thanks

2 Answers:

Answer 0: (score: 1)

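It is hard to say without sample starting data, but I agree with @Kevin: your query can be simplified from the start. In particular, it does a lot of unnecessary work whose results are mostly thrown away.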
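To see where the extra rows come from, here is a minimal sketch that replays the question's `CTE_3` numbering and final join on the four start/stop rows from the posted `CTE_1` output. The table variable `@Hits` and the CTE name `Numbered` are made up for this illustration, and the sketch assumes every road point is hit only once, as in the sample.

```sql
-- The four start/stop rows copied from the CTE_1 output in the question.
DECLARE @Hits TABLE (
    Indx_RoadPoint char(8)  NOT NULL,  -- shortened GUIDs, as posted
    SegmentNumber  int      NOT NULL,
    InTime         datetime NULL,
    OutTime        datetime NULL,
    rowNum         int      NOT NULL
);

INSERT INTO @Hits (Indx_RoadPoint, SegmentNumber, InTime, OutTime, rowNum)
VALUES ('C40FD10F', 1, '2014-06-11T09:09:11.200', NULL, 3),
       ('5B97F78A', 1, NULL, '2014-06-11T09:09:20.200', 12),
       ('E55C20C5', 2, '2014-06-11T09:09:21.200', NULL, 13),
       ('A463F4FA', 2, NULL, '2014-06-11T09:09:25.200', 17);

WITH Numbered AS (
    -- Same numbering as CTE_3: it restarts for every road point.
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Indx_RoadPoint ORDER BY rowNum) AS rn2
    FROM @Hits
)
SELECT t1.SegmentNumber, t1.InTime, t2.OutTime,
       DATEDIFF(second, t1.InTime, t2.OutTime) AS Duration
FROM Numbered AS t1
JOIN Numbered AS t2 ON t1.rn2 = t2.rn2   -- same join condition as the question
WHERE t1.InTime IS NOT NULL
  AND t2.OutTime IS NOT NULL;
```

Because each road point occurs once, every `rn2` is 1, so the join pairs every start with every stop and returns the same four durations the question shows (9, 14, -1 and 4). For this sample, joining on `t1.SegmentNumber = t2.SegmentNumber` instead would leave only the 9 and 4 second rows, provided each segment has exactly one start and one stop.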

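The query below makes the following assumptions:

  1. Every segment has both an In time and an Out time, and only one of each.
  2. The Out time is always later than the In time.
  3. SegmentNumber is correct for all In/Out pairs (although it could be generated if necessary).

You should be able to use something close to the following: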

    SELECT NEWID() AS Indx,
           @Indx_Road AS Indx_Road, @Indx_User AS Indx_User,
           SegmentNumber,  -- carried through so each duration stays tied to its segment
           inTime, outTime, DATEDIFF(second, inTime, outTime) AS duration,
           GETDATE() AS dateRecorded
    FROM (SELECT Road.SegmentNumber,
                 -- only start/end points are joined, so MIN is the start hit and MAX the end hit
                 MIN(UserRoad.TimeHit) AS inTime, MAX(UserRoad.TimeHit) AS outTime
          FROM UserRoadData UserRoad
          JOIN RoadPoints Road
            ON Road.Indx = UserRoad.Indx_RoadPoint
               AND Road.Indx_Road = @Indx_Road
               AND Road.Indx_PointType IN (@SegmentStart, @SegmentEnd)
          WHERE UserRoad.Indx_User = @Indx_User
                AND UserRoad.Indx_UserRoadDataStatus = @NotProcessed
          GROUP BY Road.SegmentNumber) SegmentTime

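I doubt you will be able to use an index to satisfy the GROUP BY directly, although the other clauses should restrict your starting set considerably. You were discarding most of the rows returned from CTE_1 anyway, so I don't include them in the first place.

I don't know what effect it will have on performance in your situation, but you should be able to insert this directly into the target table without messing around with an intermediate temp table.

Note that this query is written on the assumption that it runs on demand for a few instances of your input parameters. If you are running this over bulk entries, the query should change.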
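As a rough sketch of the direct insert mentioned above: table variables stand in for the real tables so the script runs on its own, the point-type values 1 and 2 are made up, and the column list of the target is assumed from the result headers shown in the question, so adjust it to the real `dbo.UserRideData` definition.

```sql
-- Hypothetical values; in the real procedure these arrive as parameters.
DECLARE @SegmentStart int = 1, @SegmentEnd int = 2, @NotProcessed int = 0;
DECLARE @Indx_User  uniqueidentifier = NEWID(),
        @Indx_Road  uniqueidentifier = NEWID(),
        @StartPoint uniqueidentifier = NEWID(),
        @EndPoint   uniqueidentifier = NEWID();

-- Stand-ins for RoadPoints and UserRoadData, trimmed to the columns the query uses.
DECLARE @RoadPoints TABLE (
    Indx uniqueidentifier NOT NULL, Indx_Road uniqueidentifier NOT NULL,
    Indx_PointType int NOT NULL, SegmentNumber int NOT NULL);
DECLARE @UserRoadData TABLE (
    Indx_User uniqueidentifier NOT NULL, Indx_RoadPoint uniqueidentifier NOT NULL,
    TimeHit datetime NOT NULL, Indx_UserRoadDataStatus int NOT NULL);
-- Stand-in for the target table; columns assumed from the question's result headers.
DECLARE @UserRideData TABLE (
    Indx uniqueidentifier NOT NULL, Indx_Road uniqueidentifier NOT NULL,
    Indx_User uniqueidentifier NOT NULL, SegmentNumber int NOT NULL,
    InTime datetime NOT NULL, OutTime datetime NOT NULL,
    Duration int NOT NULL, DateRecorded datetime NOT NULL);

INSERT INTO @RoadPoints VALUES
    (@StartPoint, @Indx_Road, @SegmentStart, 1),
    (@EndPoint,   @Indx_Road, @SegmentEnd,   1);
INSERT INTO @UserRoadData VALUES
    (@Indx_User, @StartPoint, '2014-06-11T09:09:11.200', @NotProcessed),
    (@Indx_User, @EndPoint,   '2014-06-11T09:09:20.200', @NotProcessed);

-- Write the per-segment durations straight into the target, with no temp table.
INSERT INTO @UserRideData
       (Indx, Indx_Road, Indx_User, SegmentNumber, InTime, OutTime, Duration, DateRecorded)
SELECT NEWID(), @Indx_Road, @Indx_User, SegmentTime.SegmentNumber,
       SegmentTime.inTime, SegmentTime.outTime,
       DATEDIFF(second, SegmentTime.inTime, SegmentTime.outTime), GETDATE()
FROM (SELECT Road.SegmentNumber,
             MIN(UserRoad.TimeHit) AS inTime, MAX(UserRoad.TimeHit) AS outTime
      FROM @UserRoadData AS UserRoad
      JOIN @RoadPoints AS Road
        ON Road.Indx = UserRoad.Indx_RoadPoint
       AND Road.Indx_Road = @Indx_Road
       AND Road.Indx_PointType IN (@SegmentStart, @SegmentEnd)
      WHERE UserRoad.Indx_User = @Indx_User
        AND UserRoad.Indx_UserRoadDataStatus = @NotProcessed
      GROUP BY Road.SegmentNumber) AS SegmentTime;

SELECT SegmentNumber, InTime, OutTime, Duration FROM @UserRideData;
```

Naming the target columns explicitly, rather than relying on `SELECT *` and column order, keeps the insert from silently misaligning if the table definition changes.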

Answer 1: (score: 0)

    SELECT
        CTE_2.Indx,
        CTE_2.Indx_Road,
        CTE_2.Indx_User,
        CTE_2.SegmentNumber,
        CTE_2.InTime,
        CTE_2.OutTime,
        DATEDIFF(SECOND, CTE_2.InTime, CTE_2.OutTime) AS Duration,
        GETDATE() AS DateRecorded
    FROM
    (
        SELECT
            newid() AS Indx,
            CTE_1.Indx_Road,
            @Indx_User AS Indx_User,
            CTE_1.SegmentNumber,
            MAX(CTE_1.InTime) AS InTime,
            MAX(CTE_1.OutTime) AS OutTime
        FROM
        (
            SELECT
                urd.Indx_RoadPoint,
                rp.Indx_Road,
                rp.SegmentNumber,
                urd.Indx_UserRoadDataStatus,
                (CASE WHEN Indx_PointType = @SegmentStart THEN TimeHit ELSE NULL END) AS InTime,
                (CASE WHEN Indx_PointType = @SegmentEnd THEN TimeHit ELSE NULL END) AS OutTime,
                (row_number() OVER ( ORDER BY TimeHit ASC )) AS rowNum
            FROM UserRoadData urd
            INNER JOIN RoadPoints rp
                ON urd.Indx_RoadPoint = rp.Indx
                AND (rp.Indx_PointType = @SegmentStart OR rp.Indx_PointType = @SegmentEnd)
            WHERE urd.Indx_User = @Indx_User
                AND rp.Indx_Road = @Indx_Road
                AND urd.Indx_UserRoadDataStatus = @NotProcessed
        ) AS CTE_1
        GROUP BY CTE_1.Indx_Road, CTE_1.SegmentNumber
    ) AS CTE_2
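The query above appears to rely on conditional aggregation: each row carries either an InTime or an OutTime, and grouping by segment folds the two into one row. A minimal sketch of just that step, using the four start/stop rows from the question (the table variable `@Hits` is a made-up name for this illustration):

```sql
-- The four start/stop rows from the question's CTE_1 output.
DECLARE @Hits TABLE (
    SegmentNumber int      NOT NULL,
    InTime        datetime NULL,
    OutTime       datetime NULL
);

INSERT INTO @Hits (SegmentNumber, InTime, OutTime)
VALUES (1, '2014-06-11T09:09:11.200', NULL),
       (1, NULL, '2014-06-11T09:09:20.200'),
       (2, '2014-06-11T09:09:21.200', NULL),
       (2, NULL, '2014-06-11T09:09:25.200');

-- MAX skips NULLs, so each group keeps its one non-NULL InTime and OutTime.
SELECT SegmentNumber,
       MAX(InTime)  AS InTime,
       MAX(OutTime) AS OutTime,
       DATEDIFF(second, MAX(InTime), MAX(OutTime)) AS Duration
FROM @Hits
GROUP BY SegmentNumber
ORDER BY SegmentNumber;
```

For these rows this returns one row per segment, with durations of 9 and 4 seconds. It depends on each segment having a single start and a single stop among the unprocessed rows; a segment ridden twice would collapse into one pair made of its latest start and latest stop.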