Limiting join matches

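Time: 2018-10-18 09:18:04

Tags: mysql performance time-series

I am trying to filter some results that are indexed by timestamp, using another set of results that defines the valid timestamp periods.

Current query: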

SELECT Measurements.moment AS "moment",
       Measurements.actualValue,
       start,
       stop
FROM Measurements
       INNER JOIN (SELECT COALESCE(@previousValue <> M.actualValue AND @previousResource = M.resourceId, 1) AS "changed",
                          (COALESCE(@previousMoment, ?)) AS "start",
                          M.moment AS "stop",
                          @previousValue AS "actualValue",
                          M.resourceId,
                          @previousMoment := moment,
                          @previousValue := M.actualValue,
                          @previousResource := M.resourceId
                   FROM Measurements `M`
                          INNER JOIN (SELECT @previousValue := NULL, @previousResource := NULL, @previousMoment := NULL) `d`
                   WHERE (M.moment BETWEEN ? AND ?) AND
                         (M.actualValue > ?)
                   ORDER BY M.resourceId ASC, M.moment ASC) `changes` ON Measurements.moment BETWEEN changes.start AND changes.stop
WHERE (Measurements.resourceId = 1) AND
      (Measurements.moment BETWEEN ? AND ?) AND
      (changes.changed)
ORDER BY Measurements.moment ASC;

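(resourceId, moment) is already an index. Since this is really time-series data, is there any way to limit the join to only one match in order to improve performance?

Sample data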

+-------------+---------------------+------------+
| actualValue | moment              | resourceId |
+-------------+---------------------+------------+
|        0.01 | 2018-09-26 07:50:25 |        1   |
|        0.01 | 2018-09-26 07:52:35 |        1   |
|        0.01 | 2018-09-26 07:52:44 |        2   |
|        0.01 | 2018-09-26 07:52:54 |        1   |
|        0.01 | 2018-09-26 07:53:03 |        1   |
|        0.01 | 2018-09-26 07:53:13 |        2   |
|        0.01 | 2018-09-26 07:53:22 |        1   |
|        0.01 | 2018-09-26 07:54:32 |        1   |
|        0.01 | 2018-09-26 07:55:41 |        1   |
|        0.01 | 2018-09-26 07:56:51 |        1   |
+-------------+---------------------+------------+
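
To reproduce the sample locally, something like the following could be used. This is only a sketch: the column types and the index name idx_resource_moment are assumptions, since the question shows only column names and values.

```sql
-- Assumed schema: the types and the index name are guesses, not from the question.
CREATE TABLE Measurements (
  actualValue DECIMAL(10, 2) NOT NULL,
  moment      DATETIME       NOT NULL,
  resourceId  INT            NOT NULL,
  -- The question states that (resourceId, moment) is already indexed.
  INDEX idx_resource_moment (resourceId, moment)
);

-- The ten sample rows, copied from the table above.
INSERT INTO Measurements (actualValue, moment, resourceId) VALUES
  (0.01, '2018-09-26 07:50:25', 1),
  (0.01, '2018-09-26 07:52:35', 1),
  (0.01, '2018-09-26 07:52:44', 2),
  (0.01, '2018-09-26 07:52:54', 1),
  (0.01, '2018-09-26 07:53:03', 1),
  (0.01, '2018-09-26 07:53:13', 2),
  (0.01, '2018-09-26 07:53:22', 1),
  (0.01, '2018-09-26 07:54:32', 1),
  (0.01, '2018-09-26 07:55:41', 1),
  (0.01, '2018-09-26 07:56:51', 1);
```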

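Expected output: all measurements taken with resourceId=1 for which resourceId=2 also took a measurement in the same minute (in a more advanced version, the length of that window could be dynamic).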

+-------------+---------------------+------------+
| actualValue | moment              | resourceId |
+-------------+---------------------+------------+
|        0.01 | 2018-09-26 07:52:35 |        1   |
|        0.01 | 2018-09-26 07:52:54 |        1   |
|        0.01 | 2018-09-26 07:53:03 |        1   |
|        0.01 | 2018-09-26 07:53:22 |        1   |
+-------------+---------------------+------------+

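3 answers:

Answer 0: (score: 0)

When you use an independent subquery (as is the case here), it is executed in full before the outer query. In your case its result may be huge, and most of its rows are probably not needed at all.

If you rephrase the query as a plain inner JOIN, the secondary access to the table is filtered right away, which avoids a full table scan.

Try the following query: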

select 
    m.moment,
    m.actualValue,
    c.moment as start,
    timestampadd(minute, 1, c.moment) as stop
  from Measurements m
  join Measurements c on m.moment
    between c.moment and timestampadd(minute, 1, c.moment)
  where m.resourceId = 1
    and c.resourceId = 2
    and m.moment between ? and ?
  order by m.moment
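
Note that with a plain join, a row of m is returned once for every c row whose window contains it. If the goal is at most one match per measurement, a semi-join with EXISTS is one way to express that. The following is a sketch, not a tested drop-in replacement: it assumes "same minute" means the same calendar minute, that moment is a DATETIME without fractional seconds, and it uses literal bounds taken from the sample data where the real query would use ? placeholders.

```sql
-- Sketch: keep each resourceId = 1 row at most once, if resourceId = 2
-- has any measurement in the same calendar minute.
SELECT m.actualValue, m.moment, m.resourceId
FROM Measurements AS m
WHERE m.resourceId = 1
  AND m.moment BETWEEN '2018-09-26 07:00:00' AND '2018-09-26 08:00:00'
  AND EXISTS (
        SELECT 1
        FROM Measurements AS c
        WHERE c.resourceId = 2
          -- Start of m's minute (inclusive) ...
          AND c.moment >= m.moment - INTERVAL SECOND(m.moment) SECOND
          -- ... up to the start of the next minute (exclusive).
          AND c.moment <  m.moment - INTERVAL SECOND(m.moment) SECOND
                                   + INTERVAL 1 MINUTE
      )
ORDER BY m.moment;
```

The inner lookup is an equality on resourceId plus a range on moment, so it should be able to use the (resourceId, moment) index. For a dynamic window, the two bounds would be replaced by whatever window expression is needed.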

Answer 1: (score: 0)

Composite index needed:

Measurements:  INDEX(resourceId, moment)  -- in this order
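
A sketch of how that could be checked and, if necessary, added. The index name idx_resource_moment is a made-up name for illustration.

```sql
-- List the existing indexes; look for one whose first column is resourceId
-- and whose second column is moment.
SHOW INDEX FROM Measurements;

-- Only if no such index is listed (the name is hypothetical):
ALTER TABLE Measurements
  ADD INDEX idx_resource_moment (resourceId, moment);

-- Check which index the optimizer picks for a typical range lookup.
EXPLAIN
SELECT moment, actualValue
FROM Measurements
WHERE resourceId = 1
  AND moment BETWEEN '2018-09-26 07:00:00' AND '2018-09-26 08:00:00';
```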

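You may also want AND (Measurements.moment BETWEEN ? AND ?) inside the subquery.

In a "derived table" (the kind of subquery you have), the optimizer is free to ignore the ORDER BY. However, if you add a LIMIT, the ORDER BY will be honored.

Answer 2: (score: 0)

I found a solution that uses table unpivoting: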
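
As a sketch of that idea: the LIMIT value below is the largest unsigned 64-bit integer, which the MySQL manual itself uses to mean "all remaining rows". The literal time bounds come from the sample data and stand in for the ? placeholders.

```sql
-- Sketch: a LIMIT in the derived table keeps it from being merged into the
-- outer query, so its ORDER BY is applied when the derived table is built.
SELECT changes.resourceId, changes.moment, changes.actualValue
FROM (SELECT M.resourceId, M.moment, M.actualValue
      FROM Measurements AS M
      WHERE M.moment BETWEEN '2018-09-26 07:00:00' AND '2018-09-26 08:00:00'
      ORDER BY M.resourceId ASC, M.moment ASC
      LIMIT 18446744073709551615) AS changes;
```

Even with the ordering in place, MySQL documents the evaluation order of user-variable assignments within a SELECT as undefined, so queries that rely on @variables row by row remain fragile.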

SELECT moment, value
FROM (SELECT IF(resourceId = ? AND @previousValue = 0, NULL, actualValue)       AS value,
             measurements.moment,
             resourceId,
             @previousValue := IF(resourceId <> ?, actualValue, @previousValue) AS enabled
      FROM (SELECT *
            FROM (SELECT moment,
                         Measurements.actualValue,
                         Measurements.resourceId AS resourceId
                  FROM Measurements
                  WHERE Measurements.resourceId = ?
                    AND moment BETWEEN ? AND ?
                  UNION (SELECT start,
                                periods.actualValue AS actualValue,
                                resourceId
                         FROM (SELECT COALESCE(@previousValue <> M3.actualValue, 1) AS "changed",
                                      (COALESCE(@previousMoment, ?)) AS "start",
                                      @previousMoment := M3.moment AS "stop",
                                      COALESCE(@previousValue, IF(M3.actualValue = 1, 0, 1)) AS "actualValue",
                                      M3.resourceId AS resourceId,
                                      @previousValue := M3.actualValue
                               FROM Measurements `M3`
                                      INNER JOIN (SELECT @previousValue := NULL,
                                                         @previousMoment := NULL) `d`
                               WHERE (M3.moment BETWEEN ? AND ?)
                               ORDER BY M3.resourceId ASC, M3.moment ASC) AS periods
                         WHERE periods.changed)) AS measurements
            ORDER BY moment ASC) AS measurements
             INNER JOIN (SELECT @previousValue := NULL) `k`) AS mixed
WHERE value IS NOT NULL
  AND resourceId = ?;

This effectively goes over the table once per SELECT, and it handles about 40k x ~4k rows within 100 ms.