SQL - Group table entries that are within X amount of time of each other

Time: 2016-02-07 06:48:32

Tags: mysql sql google-bigquery

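I need to group together, per device, the entries whose timestamp is X seconds or less after the previous entry, and then average the value within each group. In the example below I have a table with this data, and I need to group by device the entries that are no more than 60 seconds apart from each other.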

          Device            Timestamp  Value
0  30:8c:fb:a4:b9:8b  10/26/2015 22:50:15     34
1  30:8c:fb:a4:b9:8b  10/26/2015 22:50:46     34
2  c0:ee:fb:35:ec:cd  10/26/2015 22:50:50     33
3  c0:ee:fb:35:ec:cd  10/26/2015 22:50:51     32
4  30:8c:fb:a4:b9:8b  10/26/2015 22:51:15     34
5  30:8c:fb:a4:b9:8b  10/26/2015 22:51:47     32
6  c0:ee:fb:35:ec:cd  10/26/2015 22:52:38     38
7  30:8c:fb:a4:b9:8b  10/26/2015 22:54:46     34

This should be the resulting table:

          Device           First_seen            Last_seen Average_value
0  30:8c:fb:a4:b9:8b  10/26/2015 22:50:15  10/26/2015 22:51:47          33,5
1  c0:ee:fb:35:ec:cd  10/26/2015 22:50:50  10/26/2015 22:50:51          32,5
2  c0:ee:fb:35:ec:cd  10/26/2015 22:52:38  10/26/2015 22:52:38            38
3  30:8c:fb:a4:b9:8b  10/26/2015 22:54:46  10/26/2015 22:54:46            34

Thank you very much for your help.

2 answers:

Answer 0 (score: 1)

There is an old trick for this!
It relies mostly on the power of window functions, and it works perfectly in BigQuery!

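First, you "mark" every entry that comes more than 60 seconds after the previous entry for the same device.
Entries that exceed the threshold get the value 1 and the rest get the value 0.

Second, you define the groups by summing all previous marks (these steps are, of course, done while partitioning by device), so each marked entry starts a new group.

Finally, you do a simple GROUP BY on the device and the group number defined above.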

Three simple steps, implemented in one query with a few simple sub-selects! Hope this helps.

SELECT device, MIN(ts) AS first_seen, MAX(ts) AS last_seen, AVG(value) AS average_value
FROM (
  SELECT device, ts, value, SUM(grp_start) OVER (PARTITION BY device ORDER BY ts) AS grp
  FROM (
   SELECT device, ts, value, 
   IF(TIMESTAMP_TO_SEC(TIMESTAMP(ts))-TIMESTAMP_TO_SEC(TIMESTAMP(ts0))>60,1,0) AS grp_start
   FROM (
      SELECT device, ts, value, LAG(ts, 1) OVER(PARTITION BY device ORDER BY ts) AS ts0
      FROM yourTable
    )
  )
)
GROUP BY device, grp
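
To make the three steps easier to follow, here is a minimal Python sketch that traces them on the sample rows from the question. It is only an illustration of the mark / running-sum / group idea, not a replacement for the query: the names `ROWS`, `MAX_GAP_SECONDS` and `group_sessions` are made up for this example, and the sort plus loop stands in for `PARTITION BY device ORDER BY ts`.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows copied from the question: (device, timestamp, value).
ROWS = [
    ("30:8c:fb:a4:b9:8b", "10/26/2015 22:50:15", 34),
    ("30:8c:fb:a4:b9:8b", "10/26/2015 22:50:46", 34),
    ("c0:ee:fb:35:ec:cd", "10/26/2015 22:50:50", 33),
    ("c0:ee:fb:35:ec:cd", "10/26/2015 22:50:51", 32),
    ("30:8c:fb:a4:b9:8b", "10/26/2015 22:51:15", 34),
    ("30:8c:fb:a4:b9:8b", "10/26/2015 22:51:47", 32),
    ("c0:ee:fb:35:ec:cd", "10/26/2015 22:52:38", 38),
    ("30:8c:fb:a4:b9:8b", "10/26/2015 22:54:46", 34),
]

MAX_GAP_SECONDS = 60  # hypothetical name for the "X seconds" threshold


def group_sessions(rows, max_gap=MAX_GAP_SECONDS):
    """Return (device, first_seen, last_seen, average_value) per group."""
    parsed = [
        (device, datetime.strptime(ts, "%m/%d/%Y %H:%M:%S"), value)
        for device, ts, value in rows
    ]
    # Equivalent of PARTITION BY device ORDER BY ts.
    parsed.sort(key=lambda row: (row[0], row[1]))

    groups = defaultdict(list)  # (device, grp) -> [(ts, value), ...]
    prev_device, prev_ts, grp = None, None, 0
    for device, ts, value in parsed:
        if device != prev_device:
            # New partition: no previous row, so the running sum restarts at 0.
            grp = 0
        elif (ts - prev_ts).total_seconds() > max_gap:
            # Step 1 marks this row with 1; step 2 adds it to the running sum.
            grp += 1
        groups[(device, grp)].append((ts, value))
        prev_device, prev_ts = device, ts

    # Step 3: aggregate each (device, grp) bucket.
    result = []
    for (device, _grp), items in groups.items():
        stamps = [ts for ts, _ in items]
        values = [value for _, value in items]
        result.append((device, min(stamps), max(stamps), sum(values) / len(values)))
    return result


for row in group_sessions(ROWS):
    print(row)
```

On the sample data this yields four groups, matching the four rows of the expected result table in the question. Note that a gap of exactly 60 seconds does not start a new group, mirroring the `>60` comparison in the query.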

Answer 1 (score: 0)

Here's one way...

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(device CHAR(1) NOT NULL
,timestamp DATETIME NOT NULL
,value INT NOT NULL
,PRIMARY KEY(device,timestamp)
);

INSERT INTO my_table VALUES
('a','2015/10/26 22:50:15',34),
('a','2015/10/26 22:50:46',34),
('b','2015/10/26 22:50:50',33),
('b','2015/10/26 22:50:51',32),
('a','2015/10/26 22:51:15',34),
('a','2015/10/26 22:51:47',32),
('b','2015/10/26 22:52:38',38),
('a','2015/10/26 22:54:46',34);



SELECT m.*
     , AVG(n.value) avg
  FROM 
     ( SELECT a.device
            , a.timestamp start
            , MIN(c.timestamp) end 
         FROM 
            ( SELECT x.*
                   , CASE WHEN x.device = @prev THEN @i:=@i+1 ELSE @i:=1 END i
                   , @prev:=device 
                FROM my_table x
                   , (SELECT @i:=1,@prev:=null) vars 
               ORDER 
                  BY device
                   , timestamp
            ) a
         LEFT 
         JOIN 
            ( SELECT x.*
                   , CASE WHEN x.device = @prev THEN @i:=@i+1 ELSE @i:=1 END i
                   , @prev:=device 
                FROM my_table x
                   , (SELECT @i:=1,@prev:=null) vars 
               ORDER 
                  BY device
                   , timestamp
            ) b
           ON b.device = a.device
          AND b.timestamp > a.timestamp - INTERVAL 60 SECOND
          AND b.i = a.i - 1
         LEFT 
         JOIN 
            ( SELECT x.*
                   , CASE WHEN x.device = @prev THEN @i:=@i+1 ELSE @i:=1 END i
                   , @prev:=device 
                FROM my_table x
                   , (SELECT @i:=1,@prev:=null) vars 
               ORDER 
                  BY device
                   , timestamp
            ) c
           ON c.device = a.device
          AND c.i >= a.i
         LEFT 
         JOIN 
            ( SELECT x.*
                   , CASE WHEN x.device = @prev THEN @i:=@i+1 ELSE @i:=1 END i
                   , @prev:=device 
                FROM my_table x
                   , (SELECT @i:=1,@prev:=null) vars 
               ORDER 
                  BY device
                   , timestamp
            ) d 
           ON d.device = c.device
          AND d.i = c.i + 1
          AND d.timestamp < c.timestamp + INTERVAL 60 SECOND
        WHERE b.i IS NULL 
          AND c.i IS NOT NULL
          AND d.i IS NULL
        GROUP 
           BY a.device
            , a.i
            ) m
         JOIN my_table n
           ON n.device = m.device
          AND n.timestamp BETWEEN start AND end
        GROUP
           BY m.device 
            , m.start;

+--------+---------------------+---------------------+---------+
| device | start               | end                 | avg     |
+--------+---------------------+---------------------+---------+
| a      | 2015-10-26 22:50:15 | 2015-10-26 22:51:47 | 33.5000 |
| a      | 2015-10-26 22:54:46 | 2015-10-26 22:54:46 | 34.0000 |
| b      | 2015-10-26 22:50:50 | 2015-10-26 22:50:51 | 32.5000 |
| b      | 2015-10-26 22:52:38 | 2015-10-26 22:52:38 | 38.0000 |
+--------+---------------------+---------------------+---------+