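Tricky SQL query - need to get time ranges

Time: 2012-10-31 22:28:13

Tags: mysql tsql

I ran into a problem when I needed a query that produces a list of speeding time frames.

Here is a sample of the data: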

[idgps_unit_location]   [dt]                [idgps_unit]    [lat]   [long]  [speed_kmh]
26                      10/18/2012 18:53    2               47      56      30
27                      10/18/2012 18:53    2               49      58      31
28                      10/18/2012 18:53    2               28      37      15
29                      10/18/2012 18:54    2               56      65      33
30                      10/18/2012 18:54    2               152     161     73
31                      10/18/2012 18:55    2               134     143     64
32                      10/18/2012 18:56    2               22      31      12
36                      10/18/2012 18:59    2               98      107     47
37                      10/18/2012 18:59    2               122     131     58
38                      10/18/2012 18:59    2               91      100     44
39                      10/18/2012 19:00    2               190     199     98
40                      10/18/2012 19:01    2               194     203     101
41                      10/18/2012 19:02    2               182     191     91
42                      10/18/2012 19:03    2               162     171     78
43                      10/18/2012 19:03    2               174     183     83
44                      10/18/2012 19:04    2               170     179     81
45                      10/18/2012 19:05    2               189     198     97
46                      10/18/2012 19:06    2               20      29      10
47                      10/18/2012 19:07    2               158     167     76
48                      10/18/2012 19:08    2               135     144     64
49                      10/18/2012 19:08    2               166     175     79
50                      10/18/2012 19:09    2               9       18      5
51                      10/18/2012 19:09    2               101     110     48
52                      10/18/2012 19:09    2               10      19      7
53                      10/18/2012 19:10    2               32      41      20
54                      10/18/2012 19:10    1               54      63      85
55                      10/19/2012 19:11    2               55      64      50

I need a query that converts this table into the following report, which shows the time ranges during which speed > 80:

[idgps_unit]    [dt_start]          [lat_start]    [long_start]    [speed_start]    [dt_end]            [lat_end]    [long_end]    [speed_end]    [speed_average]
2               10/18/2012 19:00    190            199             98               10/18/2012 19:02    182          191           91             96.66666667
2               10/18/2012 19:03    174            183             83               10/18/2012 19:05    189          198           97             87
1               10/18/2012 19:10    54             63              85               10/18/2012 19:10    54           63            85             85

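Now, what have I tried? I tried putting the data into separate tables, querying them, and doing some joins... nothing worked, and I'm very frustrated... I'm not even sure this can be done with a query. Asking the experts for help!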

2 Answers:

Answer 0 (score: 3)

You're right, this is quite tricky, but I think I've managed it:

SELECT  s.idgps_unit,
        MIN(s.dt) AS DT_Start,
        MIN(CASE WHEN s.RowNumber = 1 THEN s.Lat END) AS Lat_Start,
        MIN(CASE WHEN s.RowNumber = 1 THEN s.Long END) AS Long_Start,
        MIN(CASE WHEN s.RowNumber = 1 THEN s.Speed_kmh END) AS Speed_Start,
        MAX(s.dt) AS dt_end,
        MIN(CASE WHEN s.RowNumber = MaxRowNumber THEN s.Lat END) AS Lat_End,
        MIN(CASE WHEN s.RowNumber = MaxRowNumber THEN s.Long END) AS Long_End,
        MIN(CASE WHEN s.RowNumber = MaxRowNumber THEN s.Speed_kmh END) AS Speed_End,

        AVG(Speed_kmh) AS Speed_Average
FROM    (   SELECT  T.*,
                    @i:= CASE WHEN Speed_Kmh > 80 AND @b = 0 THEN @i + 1 ELSE @i END AS IntervalID,
                    @r:= CASE WHEN Speed_Kmh > 80 AND @b = 0 THEN 1 ELSE @r + 1 END AS RowNumber,
                    @b:= CASE WHEN Speed_Kmh> 80 THEN 1 ELSE 0 END AS IntervalCheck
            FROM    T,
                    (SELECT @i:= 0) i,
                    (SELECT @r:= 0) r,
                    (SELECT @b:= 0) b
            ORDER BY dt, idgps_unit_location
        ) s
        INNER JOIN
        (   SELECT  IntervalID, MAX(RowNumber) AS MaxRowNumber
            FROM    (   SELECT  T.*,
                                @i:= CASE WHEN Speed_Kmh > 80 AND @b = 0 THEN @i + 1 ELSE @i END AS IntervalID,
                                @r:= CASE WHEN Speed_Kmh > 80 AND @b = 0 THEN 1 ELSE @r + 1 END AS RowNumber,
                                @b:= CASE WHEN Speed_Kmh> 80 THEN 1 ELSE 0 END AS IntervalCheck
                        FROM    T,
                                (SELECT @i:= 0) i,
                                (SELECT @r:= 0) r,
                                (SELECT @b:= 0) b
                        ORDER BY dt, idgps_unit_location
                    ) d
            WHERE   IntervalCheck = 1
            GROUP BY IntervalID
        ) MaxInt
            ON MaxInt.IntervalID = s.IntervalID
WHERE   s.IntervalCheck = 1
GROUP BY s.IntervalID, s.idgps_unit;
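
For anyone who wants to run the query above locally, a minimal setup might look like the following. This is only a sketch: the question shows sample values but no schema, so the column types are assumptions, as is the `:00` seconds part of each timestamp. Only rows 38 to 46 of the sample are loaded, which is enough to cover the first two speeding runs.

```sql
-- Hypothetical schema: column types are guesses, the question does not show them.
CREATE TABLE T (
    idgps_unit_location INT      NOT NULL PRIMARY KEY,
    dt                  DATETIME NOT NULL,
    idgps_unit          INT      NOT NULL,
    lat                 INT      NOT NULL,
    `long`              INT      NOT NULL,  -- LONG is a reserved word in MySQL, hence the backticks
    speed_kmh           INT      NOT NULL
);

-- Rows 38 to 46 of the sample data (seconds assumed to be :00).
INSERT INTO T (idgps_unit_location, dt, idgps_unit, lat, `long`, speed_kmh) VALUES
    (38, '2012-10-18 18:59:00', 2,  91, 100,  44),
    (39, '2012-10-18 19:00:00', 2, 190, 199,  98),
    (40, '2012-10-18 19:01:00', 2, 194, 203, 101),
    (41, '2012-10-18 19:02:00', 2, 182, 191,  91),
    (42, '2012-10-18 19:03:00', 2, 162, 171,  78),
    (43, '2012-10-18 19:03:00', 2, 174, 183,  83),
    (44, '2012-10-18 19:04:00', 2, 170, 179,  81),
    (45, '2012-10-18 19:05:00', 2, 189, 198,  97),
    (46, '2012-10-18 19:06:00', 2,  20,  29,  10);
```

The query qualifies the column as `s.Long`, which MySQL accepts without quoting because the name follows a period; an unqualified reference would need the backticks.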

The key is this part:

SELECT  T.*,
        @i:= CASE WHEN Speed_Kmh > 80 AND @b = 0 THEN @i + 1 ELSE @i END AS IntervalID,
        @r:= CASE WHEN Speed_Kmh > 80 AND @b = 0 THEN 1 ELSE @r + 1 END AS RowNumber,
        @b:= CASE WHEN Speed_Kmh> 80 THEN 1 ELSE 0 END AS IntervalCheck
FROM    T,
        (SELECT @i:= 0) i,
        (SELECT @r:= 0) r,
        (SELECT @b:= 0) b
ORDER BY dt, idgps_unit_location

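Each time it encounters a row where the speed is over 80, it sets the variable @b to 1. If @b was 0 before that row, it assigns a new IntervalID to the row, and when it does so it also restarts the row numbering at 1, so you end up with something like this: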

[idgps_unit_location]   [dt]                [idgps_unit]    [lat]   [long]  [speed_kmh] [IntervalID]    RowNumber   IntervalCheck
37                      10/18/2012 18:59    2               122     131     58          1               1           0
38                      10/18/2012 18:59    2               91      100     44          1               2           0
39                      10/18/2012 19:00    2               190     199     98          2               1           1
40                      10/18/2012 19:01    2               194     203     101         2               2           1
41                      10/18/2012 19:02    2               182     191     91          2               3           1
42                      10/18/2012 19:03    2               162     171     78          2               4           0
43                      10/18/2012 19:03    2               174     183     83          3               1           1

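Then you need to eliminate all rows where the speed is not above 80 (WHERE IntervalCheck = 1). Finally, you can use aggregate functions with CASE to pick out the row where RowNumber is 1 (the first speeding row) or the row with the highest RowNumber for that interval (the last speeding row). The final join simply repeats the process to find the maximum RowNumber value for each IntervalID.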
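
On MySQL 8.0 or later, the same interval numbering could probably be expressed with window functions instead of user variables. The following is a sketch under that assumption and is not part of the original answer; the names `flagged`, `numbered`, `starts_run`, `interval_id` and `run` are made up for illustration, and the table is assumed to be `T` as above.

```sql
-- Sketch only: assumes MySQL 8.0+ (CTEs and window functions) and the table T used above.
WITH flagged AS (
    SELECT  T.*,
            -- 1 when this row is over the limit and the previous row (in global order) was not
            CASE WHEN speed_kmh > 80
                  AND COALESCE(LAG(speed_kmh) OVER (ORDER BY dt, idgps_unit_location), 0) <= 80
                 THEN 1 ELSE 0 END AS starts_run
    FROM    T
),
numbered AS (
    SELECT  flagged.*,
            -- running count of run starts, playing the role of IntervalID
            SUM(starts_run) OVER (ORDER BY dt, idgps_unit_location) AS interval_id
    FROM    flagged
)
SELECT DISTINCT
        idgps_unit,
        FIRST_VALUE(dt)        OVER run AS dt_start,
        FIRST_VALUE(lat)       OVER run AS lat_start,
        FIRST_VALUE(`long`)    OVER run AS long_start,
        FIRST_VALUE(speed_kmh) OVER run AS speed_start,
        LAST_VALUE(dt)         OVER run AS dt_end,
        LAST_VALUE(lat)        OVER run AS lat_end,
        LAST_VALUE(`long`)     OVER run AS long_end,
        LAST_VALUE(speed_kmh)  OVER run AS speed_end,
        AVG(speed_kmh)         OVER run AS speed_average
FROM    numbered
WHERE   speed_kmh > 80          -- same effect as WHERE IntervalCheck = 1
WINDOW  run AS (PARTITION BY interval_id, idgps_unit
                ORDER BY dt, idgps_unit_location
                ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
```

The explicit frame on `run` matters: with the default frame, `LAST_VALUE` would only look as far as the current row. `SELECT DISTINCT` then collapses each run to a single row, which avoids the second pass that the join on `MaxRowNumber` performs.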

Example on SQL Fiddle

Answer 1 (score: 0)

Have you tried something like this (average speed calculation omitted):

SELECT * FROM (
SELECT
   start.idgps_unit,
   start.dt dt_start,
   ...
   end.dt dt_end,
   ...
   (...) average_speed
FROM
   your_table start,
   your_table end
WHERE
   start.dt < end.dt
) candidate_timeframes
WHERE average_speed > 80

This will give you a lot of overlapping time frames; I'm not sure whether that is what you want. If not, you can filter them with NOT EXISTS:

SELECT *
FROM (query_above) timeframes
WHERE NOT EXISTS (
SELECT *
FROM (query_above) longer_timeframes
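WHERE
   -- a longer time frame fully contains this one and differs at one end or the other
   longer_timeframes.dt_start <= timeframes.dt_start AND
   longer_timeframes.dt_end >= timeframes.dt_end AND
   (longer_timeframes.dt_start < timeframes.dt_start OR
    longer_timeframes.dt_end > timeframes.dt_end)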
)

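This may still leave you with some overlaps, e.g. if the speed is 60 from 19:00 to 19:03, 100 from 19:03 to 19:07, and 60 again from 19:07 to 19:10. Then you have two maximal-length time frames in which the average speed is greater than 80: one from 19:00 to 19:07 and another from 19:03 to 19:10.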
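
The arithmetic behind that example can be worked through as follows. This assumes a time-weighted average with the speed held constant over each stretch (60, 100 and 60 km/h for 3, 4 and 3 minutes; the first 60 is inferred from the "again" in the example). The answer itself does not say how the average would be computed.

```latex
\begin{align*}
\bar v_{19{:}00 \to 19{:}07} &= \frac{3 \cdot 60 + 4 \cdot 100}{7} = \frac{580}{7} \approx 82.9 > 80, \\
\bar v_{19{:}03 \to 19{:}10} &= \frac{4 \cdot 100 + 3 \cdot 60}{7} = \frac{580}{7} \approx 82.9 > 80, \\
\bar v_{19{:}00 \to 19{:}10} &= \frac{3 \cdot 60 + 4 \cdot 100 + 3 \cdot 60}{10} = \frac{760}{10} = 76 < 80.
\end{align*}
```

Both seven-minute frames qualify, but the full ten-minute frame does not, so neither frame can be extended to absorb the other and the two overlap between 19:03 and 19:07.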