Query to find two consecutive values in a column

Time: 2018-04-10 16:37:18

Tags: php mysql

I have a table named events. It looks like this:

id | location_id | type    | date
1  | 123         | success | 2018-01-02
2  | 45          | success | 2018-01-13
3  | 123         | failure | 2018-01-23
4  | 66          | failure | 2018-02-04
5  | 123         | success | 2018-02-06
6  | 66          | failure | 2018-03-04

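The type column can only have two values: 'success' or 'failure'. What I need to do is the following: find every location_id in the events table that has at least two consecutive entries with type=failure. Consecutive means consecutive when the entries are ordered by date. In the example above, only location_id 66 should be returned, because it has two consecutive failures in the type column.

The obvious solution would be: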

iterate through location_ids
    get all entries from events table for each location_id, ordered by date
        iterate through the results and return true if we find two consecutive rows with type=failure
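The steps above can be sketched as follows. This is a minimal sketch, assuming Python with the standard sqlite3 module and an in-memory copy of the sample table rather than the MySQL/PHP setup described here; the function name is hypothetical. Ordering by location_id and then date lets a single pass over the rows replace the nested loops.

```python
import sqlite3

# Sample rows from the question: (id, location_id, type, date).
ROWS = [
    (1, 123, "success", "2018-01-02"),
    (2, 45, "success", "2018-01-13"),
    (3, 123, "failure", "2018-01-23"),
    (4, 66, "failure", "2018-02-04"),
    (5, 123, "success", "2018-02-06"),
    (6, 66, "failure", "2018-03-04"),
]


def locations_with_consecutive_failures(conn):
    """Hypothetical helper: location_ids with two failures in a row by date."""
    found = set()
    prev_location, prev_type = None, None
    cursor = conn.execute(
        "SELECT location_id, type FROM events ORDER BY location_id, date, id"
    )
    for location_id, event_type in cursor:
        # Only compare a row with the previous row of the same location.
        same_location = location_id == prev_location
        if same_location and prev_type == "failure" and event_type == "failure":
            found.add(location_id)
        prev_location, prev_type = location_id, event_type
    return sorted(found)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER, location_id INTEGER, type TEXT, date TEXT)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", ROWS)
result = locations_with_consecutive_failures(conn)
print(result)
```

This still reads every row, so it only illustrates the logic; it does not address the cost concern.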

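The problem I have with this approach: I have a few thousand location_ids, and each has hundreds of entries in the events table. That means we can end up with hundreds of thousands of operations every time this task runs (which is often, because its result should be displayed on the home page of our admin panel).

So I'm wondering whether anyone knows of a better solution. I've tried searching for a query that would help me with this, but to no avail.

1 answer:

Answer 0 (score: 1)

Create table / insert data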

CREATE TABLE events
    (`id` int, `location_id` int, `type` varchar(7), `date` date)
;

INSERT INTO events
    (`id`, `location_id`, `type`, `date`)
VALUES
    (1, 123, 'success', '2018-01-02'),
    (2, 45, 'success', '2018-01-13'),
    (3, 123, 'failure', '2018-01-23'),
    (4, 66, 'failure', '2018-02-04'),
    (5, 123, 'success', '2018-02-06'),
    (6, 66, 'failure', '2018-03-04')
;

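For this solution I assume that by "consecutive" you mean one of the following.

  1. Consecutive months with the same year and the same day

    So
    2018-02-04
    2018-03-04
    are consecutive values

  2. Consecutive days within the same year and the same month

    So
    2018-02-04
    2018-02-05
    are consecutive values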

  3. We only need to display the location_id, not the dates or the last failure, so three or more failures should not make a difference.
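The two date rules in items 1 and 2 can be written as a small predicate. This is an illustrative sketch in Python, not part of the MySQL solution; the function name `is_consecutive` is hypothetical. It also shows a consequence of the assumption: two dates that straddle a month boundary do not count as consecutive under these rules.

```python
from datetime import date


def is_consecutive(first: date, second: date) -> bool:
    """Mirror the two assumed rules for 'consecutive' dates."""
    same_year = first.year == second.year
    # Rule 1: same year, same day of month, months differ by exactly one.
    next_month = (
        same_year
        and abs(first.month - second.month) == 1
        and first.day == second.day
    )
    # Rule 2: same year, same month, days differ by exactly one.
    next_day = (
        same_year
        and first.month == second.month
        and abs(first.day - second.day) == 1
    )
    return next_month or next_day


checks = {
    "month_step": is_consecutive(date(2018, 2, 4), date(2018, 3, 4)),
    "day_step": is_consecutive(date(2018, 2, 4), date(2018, 2, 5)),
    # 2018-01-31 -> 2018-02-01 matches neither rule.
    "month_boundary": is_consecutive(date(2018, 1, 31), date(2018, 2, 1)),
}
print(checks)
```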

    The best starting point is a query that, for each location_id and type group, matches at least 2 records with distinct dates, with the filter on type = 'failure'.

    Query

    SELECT 
       location_id 
     , type
    FROM 
     events 
    WHERE
     type = 'failure'
    GROUP BY
       location_id 
     , type
    HAVING 
      COUNT(DISTINCT date) >= 2
    

    Result

    | location_id |    type |
    |-------------|---------|
    |          66 | failure |
    

    See demo http://sqlfiddle.com/#!9/df4679e/56

    Now we use an INNER JOIN to get all the matching records.

    Query

    SELECT 
     events.*
    FROM ( 
    
      SELECT 
         location_id 
       , type
      FROM 
       events 
      WHERE
       type = 'failure'
      GROUP BY
         location_id 
       , type
      HAVING 
        COUNT(DISTINCT date) >= 2
    ) AS events_grouped
    
    INNER JOIN
     events
    ON
       events_grouped.location_id = events.location_id
     AND
       events_grouped.type = events.type
    

    Result

    | id | location_id |    type |       date |
    |----|-------------|---------|------------|
    |  4 |          66 | failure | 2018-02-04 |
    |  6 |          66 | failure | 2018-03-04 |
    

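    Now we need access to the next record. Some databases support LEAD for this.
    But the production-ready MySQL versions at the time of writing do not support it (window functions such as LEAD arrived in MySQL 8.0), so we will simulate LEAD with a shifted self-join.

    Query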

    SELECT 
       events1.*
     , events2.*
    FROM ( 
    
      SELECT 
         location_id 
       , type
      FROM 
       events 
      WHERE
       type = 'failure'
      GROUP BY
         location_id 
       , type
      HAVING 
        COUNT(DISTINCT date) >= 2
    ) AS events_grouped
    
    INNER JOIN
     events events1
    ON
       events_grouped.location_id = events1.location_id
     AND
       events_grouped.type = events1.type
    
    INNER JOIN 
     events events2
    ON
       # shift to have access to the next record.
         events1.id <> events2.id 
       AND
         events1.date <= events2.date
    

    Result

    | id | location_id |    type |       date | id | location_id |    type |       date |
    |----|-------------|---------|------------|----|-------------|---------|------------|
    |  4 |          66 | failure | 2018-02-04 |  5 |         123 | success | 2018-02-06 |
    |  4 |          66 | failure | 2018-02-04 |  6 |          66 | failure | 2018-03-04 |
    

    See demo http://sqlfiddle.com/#!9/df4679e/62

    You can clearly see that the records are shifted in the JOIN, so we can now add the consecutive-value checks I described.

    Final query

    SELECT 
     events1.location_id
    FROM ( 
    
      SELECT 
         location_id 
       , type
      FROM 
       events 
      WHERE
       type = 'failure'
      GROUP BY
         location_id 
       , type
      HAVING 
        COUNT(DISTINCT date) >= 2
    ) AS events_grouped
    
    INNER JOIN
     events events1
    ON
       events_grouped.location_id = events1.location_id
     AND
       events_grouped.type = events1.type
    
    INNER JOIN 
     events events2
    ON
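       # shift to have access to the next record, and only pair rows
       # from the same location with the same type
         events1.location_id = events2.location_id
       AND
         events1.type = events2.type
       AND
         events1.id <> events2.id 
       AND
         events1.date <= events2.date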
       AND
       (  
         (
           # check consecutive MONTH: YEAR and DAY need to be the same,
           # so 2018-02-04 and 2018-03-04 are consecutive values
           ABS(YEAR(events1.date) - YEAR(events2.date)) = 0
         AND
           ABS(MONTH(events1.date) - MONTH(events2.date)) = 1
         AND
           ABS(DAY(events1.date) - DAY(events2.date)) = 0   
         )
         OR
         (
           # check consecutive DAY: YEAR and MONTH need to be the same,
           # so 2018-02-04 and 2018-02-05 are consecutive values
             ABS(YEAR(events1.date) - YEAR(events2.date)) = 0
           AND
             ABS(MONTH(events1.date) - MONTH(events2.date)) = 0
           AND
             ABS(DAY(events1.date) - DAY(events2.date)) = 1   
         )   
       )
    

    Result

    | location_id |
    |-------------|
    |          66 |
    

    See demo http://sqlfiddle.com/#!9/df4679e/65
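For comparison, here is a sketch that follows the question's own definition of consecutive (two failures for the same location with no other event type between them when ordered by date) instead of the calendar-based assumption used above. It is an alternative technique, not the query from this answer: a self-join on failures plus NOT EXISTS. It runs against SQLite through Python's sqlite3 module so that it is self-contained; the SQL uses only joins and NOT EXISTS, so it should carry over to MySQL, but that is untested here. Rows 7 to 11 are hypothetical additions, and the sketch assumes dates are unique per location.

```python
import sqlite3

# Rows 1-6 are the sample data; rows 7-11 are hypothetical additions
# used only to exercise the query.
ROWS = [
    (1, 123, "success", "2018-01-02"),
    (2, 45, "success", "2018-01-13"),
    (3, 123, "failure", "2018-01-23"),
    (4, 66, "failure", "2018-02-04"),
    (5, 123, "success", "2018-02-06"),
    (6, 66, "failure", "2018-03-04"),
    (7, 77, "failure", "2018-01-05"),
    (8, 77, "failure", "2018-04-20"),   # next event for 77, months later
    (9, 88, "failure", "2018-01-01"),
    (10, 88, "success", "2018-01-02"),  # breaks the run for 88
    (11, 88, "failure", "2018-01-03"),
]

# Pair each failure (f1) with a later failure (f2) of the same location and
# keep the pair only if no non-failure event lies strictly between them.
QUERY = """
SELECT DISTINCT f1.location_id
FROM events AS f1
JOIN events AS f2
  ON f2.location_id = f1.location_id
 AND f2.type = 'failure'
 AND f2.date > f1.date
WHERE f1.type = 'failure'
  AND NOT EXISTS (
        SELECT 1
        FROM events AS s
        WHERE s.location_id = f1.location_id
          AND s.type <> 'failure'
          AND s.date > f1.date
          AND s.date < f2.date
      )
ORDER BY f1.location_id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER, location_id INTEGER, type TEXT, date TEXT)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", ROWS)
result = [row[0] for row in conn.execute(QUERY)]
print(result)
```

Location 77 is returned here even though its failures are months apart, while the calendar-based final query above would skip it; which behaviour is wanted depends on what "consecutive" should mean.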