Getting the time difference to the next record in a large MySQL dataset

Time: 2015-09-07 19:45:39

Tags: mysql

I am having some trouble with the following fairly simple query.

Sample data:

+-------+------+-------+---------------------+
| f_rec | f_id | REF   | ORI_TIME            |
+-------+------+-------+---------------------+
|     1 |    1 | 20784 | 1899-12-30 11:03:18 |
|     2 |    1 | 20785 | 1899-12-30 11:03:27 |
|     3 |    1 | 20786 | 1899-12-30 11:03:28 |
|     4 |    1 | 20787 | 1899-12-30 11:03:38 |
|     5 |    1 | 20788 | 1899-12-30 11:03:45 |
|     6 |    1 | 20789 | 1899-12-30 11:03:52 |
|     7 |    2 | 50790 | 1899-12-30 11:04:09 |
|     8 |    2 | 50791 | 1899-12-30 11:04:26 |
|     9 |    2 | 50792 | 1899-12-30 11:05:27 |
|    10 |    2 | 50793 | 1899-12-30 11:05:38 |
+-------+------+-------+---------------------+

Query:

 SELECT IDfocCurr.f_rec, IDfocCurr.f_id,
   TIMESTAMPDIFF(SECOND, IDfocCurr.ORI_TIME,
        (SELECT IDfocNext.ORI_TIME
        FROM `IDfocals1999-2004` IDfocNext
        WHERE IDfocNext.REF = IDfocCurr.REF + 1
            AND IDfocCurr.f_id = IDfocNext.f_id )) as DURATION_NEW
       FROM `IDfocals1999-2004` IDfocCurr

Expected result:

+-------+------+-------+---------------------+-------------+
| f_rec | f_id | REF   | ORI_TIME            |DURATION_NEW |
+-------+------+-------+---------------------+-------------+
|     1 |    1 | 20784 | 1899-12-30 11:03:18 |            9|
|     2 |    1 | 20785 | 1899-12-30 11:03:27 |            1|
|     3 |    1 | 20786 | 1899-12-30 11:03:28 |           10|
|     4 |    1 | 20787 | 1899-12-30 11:03:38 |            7|
|     5 |    1 | 20788 | 1899-12-30 11:03:45 |            7|
|     6 |    1 | 20789 | 1899-12-30 11:03:52 |         NULL|
|     7 |    2 | 50790 | 1899-12-30 11:04:09 |           17|
|     8 |    2 | 50791 | 1899-12-30 11:04:26 |           61|
|     9 |    2 | 50792 | 1899-12-30 11:05:27 |           11|
|    10 |    2 | 50793 | 1899-12-30 11:05:38 |         NULL|
+-------+------+-------+---------------------+-------------+

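Here f_rec is the primary key and f_id is the session ID.

I want to create a table that holds the time difference, in seconds, between the next record and the current one. The table has more than 500,000 records in total, but the MySQL server (a DigitalOcean Droplet scaled up to 16 GB) hangs on the query. When I run the SELECT query on its own, it shows me results.

Question 1: As soon as I put INSERT INTO or CREATE TABLE in front of the code, the server hangs. So I have no way to export this table, which is very frustrating. What am I doing wrong?

Question 2: I have also looked into user variables and JOINs. I would be happy to use user variables but could not get the logic right, and JOINs give me similar results. I used examples such as this and this. How should I do it?

What am I doing wrong, or how can I optimize this?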

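2 answers:

Answer 0: (score: 0)

If your table has 500,000 records, this query runs the inner select once for every record, so if the inner lookup cannot use an index it may examine up to 250,000,000,000 combinations (250 billion reads). No wonder it hangs.

Try setting a variable and then using it for the subtraction, like this: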
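
The quadratic cost only applies when each inner lookup has to scan the table. A minimal sketch of one way to avoid that, assuming the table is named `IDfocals1999-2004` as in the question and that no index on (f_id, REF) exists yet; the index name idx_fid_ref is made up for illustration:

```sql
-- Hypothetical index name; assumes (f_id, REF) is not indexed yet.
ALTER TABLE `IDfocals1999-2004`
  ADD INDEX idx_fid_ref (f_id, REF);

-- Inspect how MySQL plans the correlated subquery once the index exists.
EXPLAIN
SELECT cur.f_rec, cur.f_id,
       TIMESTAMPDIFF(SECOND, cur.ORI_TIME,
           (SELECT nxt.ORI_TIME
            FROM `IDfocals1999-2004` nxt
            WHERE nxt.f_id = cur.f_id
              AND nxt.REF = cur.REF + 1)) AS DURATION_NEW
FROM `IDfocals1999-2004` cur;
```

If the EXPLAIN output shows the subquery using the new index rather than a full scan, each inner lookup should touch only a few rows instead of the whole table.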

set @lasttime := date(0);
set @lastref := -20;
set @lastfrec := 0;
set @lastfid := 0;
-- each output row describes the previous record; duration_new is the gap to the current one
select @lastref as this_ref, @lastfrec as this_frec, @lastfid as this_f_id,
       if(ref - @lastref = 1, timestampdiff(second, @lasttime, IDfocCurr.ori_time), null) as duration_new,
       @lasttime := IDfocCurr.ori_time as this_ori_time,
       @lastfrec := IDfocCurr.f_rec as next_frec,
       @lastfid := IDfocCurr.f_id as next_f_id,
       @lastref := ref as next_ref
from `IDfocals1999-2004` IDfocCurr
order by IDfocCurr.f_rec;

Or try this...

set @lasttime := null;
set @lastref := -20;
select * from (
select f_rec, f_id, REF, ORI_TIME, if(@lastref - REF = 1, timestampdiff(second, ORI_TIME, @lasttime), null) as duration_new,
       @lasttime := ORI_TIME as next_ori_time,
       @lastref := ref as next_ref
from IDfocals19992004
order by REF DESC) revorder order by REF asc
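
On MySQL 8.0 or later, the window function LEAD() is an alternative to user variables. This is a different technique from the one in this answer, shown only as a sketch; focals_demo is a hypothetical table that holds four rows of the sample data from the question.

```sql
-- focals_demo is a hypothetical table with four rows of the sample data.
CREATE TEMPORARY TABLE focals_demo (
  f_rec    INT PRIMARY KEY,
  f_id     INT NOT NULL,
  REF      INT NOT NULL,
  ORI_TIME DATETIME NOT NULL
);

INSERT INTO focals_demo (f_rec, f_id, REF, ORI_TIME) VALUES
  (5,  1, 20788, '1899-12-30 11:03:45'),
  (6,  1, 20789, '1899-12-30 11:03:52'),
  (9,  2, 50792, '1899-12-30 11:05:27'),
  (10, 2, 50793, '1899-12-30 11:05:38');

-- LEAD() reads ORI_TIME from the next row of the same session (f_id),
-- ordered by REF; the last row of each session gets NULL.
SELECT f_rec, f_id, REF, ORI_TIME,
       TIMESTAMPDIFF(SECOND, ORI_TIME,
           LEAD(ORI_TIME) OVER (PARTITION BY f_id ORDER BY REF)) AS DURATION_NEW
FROM focals_demo
ORDER BY f_rec;
```

For these four rows DURATION_NEW comes out as 7, NULL, 11, NULL, matching rows 5, 6, 9 and 10 of the expected result. Unlike the REF + 1 condition in the question, LEAD() takes the next row in REF order even when the REF values have gaps.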

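Answer 1: (score: 0)

In your query you execute a nested select for every row found, which will take a long time. The reason the query seems fast enough when you run the select on its own is that it only returns the first few rows.

Try a normal (self) join instead.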

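select id1.f_rec,
       id1.f_id,
       id1.ref,
       timestampdiff(second, id1.ori_time, id2.ori_time) as duration_new
from IDfocals id1
-- id2 is the next record (ref + 1) in the same session;
-- the left join keeps the last record of each session with a NULL duration
left join IDfocals id2 on (id2.ref = id1.ref + 1 and id1.f_id = id2.f_id)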
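
To store the result rather than only display it (Question 1), a self-join that pairs each row with the row holding REF + 1 in the same session can feed CREATE TABLE ... AS SELECT. A rough sketch, assuming the source table is called IDfocals as in this answer and that an index on (f_id, ref) exists so the join does not scan the table for every row; focal_durations is a hypothetical name for the new table:

```sql
-- focal_durations is a hypothetical target table name.
CREATE TABLE focal_durations AS
SELECT id1.f_rec,
       id1.f_id,
       id1.ref,
       id1.ori_time,
       TIMESTAMPDIFF(SECOND, id1.ori_time, id2.ori_time) AS duration_new
FROM IDfocals id1
LEFT JOIN IDfocals id2
       ON id2.f_id = id1.f_id
      AND id2.ref  = id1.ref + 1;  -- next record in the same session
```

Unlike a SELECT run in a client that shows only the first rows, this statement has to process all rows before it finishes, so the join condition needs to be cheap for every one of them.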