I have the following query:
SELECT sensor.id as `sensor_id`,
sensor_reading.id as `reading_id`,
sensor_reading.reading as `reading`,
from_unixtime(sensor_reading.reading_timestamp) as `reading_timestamp`,
sensor_reading.lower_threshold as `lower_threshold`,
sensor_reading.upper_threshold as `upper_threshold`,
sensor_type.units as `unit`
FROM sensor
LEFT JOIN sensor_reading ON sensor_reading.sensor_id = sensor.id
LEFT JOIN sensor_type ON sensor.sensor_type_id = sensor_type.id
WHERE sensor.company_id = 1
GROUP BY sensor_reading.sensor_id
ORDER BY sensor_reading.reading_timestamp DESC
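There are three tables here: the sensor_type table, which is only used for a single display field (units); the sensor table, which holds information about each sensor; and the sensor_reading table, which holds the individual readings. A single sensor has many readings, so each row in sensor_reading has a sensor_id that references the id field of the sensor table through a foreign key constraint.
In theory, this query should return the most recent sensor_reading for EACH unique sensor. Instead, it returns the first reading for each sensor. I have found a few posts here about similar problems, but I have not been able to solve this using any of their answers. Ideally the query needs to be as efficient as possible, since this table has several thousand readings (and keeps growing).
Does anyone know how to change this query so that it returns the most recent reading? If I remove the GROUP BY clause the rows come back in the correct order, but I then have to sift through the data to get the latest reading for each sensor.
Ideally I don't want to run subqueries, as that slows things down a lot, and speed is an important factor here.
Thanks!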
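Answer 0 (score: 3)
"In theory, this query should return the most recent sensor_reading for EACH unique sensor."
This is a fairly common misconception about the MySQL GROUP BY extension, which allows you to select non-aggregated columns that are not included in the GROUP BY clause. What the documentation states is: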
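"The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause."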
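So, because you are grouping by sensor_reading.sensor_id, MySQL picks an arbitrary row from sensor_reading for each sensor_id. Only once it has chosen one row per sensor_id does it apply the ordering, and it applies it to the rows already chosen.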
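One way to see this behaviour for yourself, sketched here on the assumption that you have a scratch session to experiment in, is to enable the ONLY_FULL_GROUP_BY SQL mode. With that mode on, MySQL should reject a query that selects a non-aggregated, non-grouped column rather than silently picking a value:

```sql
-- Assumption: run this in a throwaway session, because it replaces
-- whatever other SQL modes the session currently has.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- reading is neither aggregated nor listed in GROUP BY, so with the mode
-- enabled this statement is expected to fail (error 1055) instead of
-- returning an arbitrary reading per sensor.
SELECT sensor_id, reading
FROM sensor_reading
GROUP BY sensor_id;
```

This only makes the problem visible; it does not return the latest reading.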
Since you only want the latest row for each sensor, the general approach is:
SELECT *
FROM sensor_reading AS sr
WHERE NOT EXISTS
( SELECT 1
FROM sensor_reading AS sr2
WHERE sr2.sensor_id = sr.sensor_id
AND sr2.reading_timestamp > sr.reading_timestamp
);
However, MySQL will optimise LEFT JOIN/IS NULL better than NOT EXISTS, so the MySQL-specific solution would be:
SELECT sr.*
FROM sensor_reading AS sr
LEFT JOIN sensor_reading AS sr2
ON sr2.sensor_id = sr.sensor_id
AND sr2.reading_timestamp > sr.reading_timestamp
WHERE sr2.id IS NULL;
So, incorporating this into your query, you would end up with:
SELECT sensor.id as `sensor_id`,
sensor_reading.id as `reading_id`,
sensor_reading.reading as `reading`,
from_unixtime(sensor_reading.reading_timestamp) as `reading_timestamp`,
sensor_reading.lower_threshold as `lower_threshold`,
sensor_reading.upper_threshold as `upper_threshold`,
sensor_type.units as `unit`
FROM sensor
LEFT JOIN sensor_reading
ON sensor_reading.sensor_id = sensor.id
LEFT JOIN sensor_type
ON sensor.sensor_type_id = sensor_type.id
LEFT JOIN sensor_reading AS sr2
ON sr2.sensor_id = sensor_reading.sensor_id
AND sr2.reading_timestamp > sensor_reading.reading_timestamp
WHERE sensor.company_id = 1
AND sr2.id IS NULL
ORDER BY sensor_reading.reading_timestamp DESC;
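The self-join matches rows on sensor_id and then compares reading_timestamp, so a composite index on those two columns may help a great deal. The following is only a sketch: the index name is made up, and whether the optimiser actually uses the index depends on your data and MySQL version, which is why it is followed by an EXPLAIN.

```sql
-- Hypothetical index name; the columns are the ones used in the join above.
CREATE INDEX ix_sensor_reading_sensor_ts
    ON sensor_reading (sensor_id, reading_timestamp);

-- Check the plan: ideally sr2 is accessed through the new index
-- rather than by a full table scan for every row of sr.
EXPLAIN
SELECT sr.*
FROM sensor_reading AS sr
LEFT JOIN sensor_reading AS sr2
    ON sr2.sensor_id = sr.sensor_id
    AND sr2.reading_timestamp > sr.reading_timestamp
WHERE sr2.id IS NULL;
```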
Another way of getting the maximum per group is to inner join back to the latest row, as follows:
SELECT sr.*
FROM sensor_reading AS sr
INNER JOIN
( SELECT sensor_id, MAX(reading_timestamp) AS reading_timestamp
FROM sensor_reading
GROUP BY sensor_id
) AS sr2
ON sr2.sensor_id = sr.sensor_id
AND sr2.reading_timestamp = sr.reading_timestamp;
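You may find this more efficient than the other method, or you may not; YMMV. It basically depends on your data and indexes and, as you said, subqueries can be a problem in MySQL because the subquery result is fully materialised first.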
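If you want to try the derived-table approach against your full query, it could look roughly like the sketch below. The aliases s, st, latest and sr are my own, and I have used LEFT JOINs throughout on the assumption that sensors with no readings should still appear, as they do in your original query.

```sql
SELECT s.id AS `sensor_id`,
    sr.id AS `reading_id`,
    sr.reading AS `reading`,
    from_unixtime(sr.reading_timestamp) AS `reading_timestamp`,
    sr.lower_threshold AS `lower_threshold`,
    sr.upper_threshold AS `upper_threshold`,
    st.units AS `unit`
FROM sensor AS s
LEFT JOIN sensor_type AS st
    ON s.sensor_type_id = st.id
-- One row per sensor holding the timestamp of its newest reading.
LEFT JOIN
    ( SELECT sensor_id, MAX(reading_timestamp) AS reading_timestamp
      FROM sensor_reading
      GROUP BY sensor_id
    ) AS latest
    ON latest.sensor_id = s.id
-- Join back to fetch the remaining columns of that newest reading.
LEFT JOIN sensor_reading AS sr
    ON sr.sensor_id = latest.sensor_id
    AND sr.reading_timestamp = latest.reading_timestamp
WHERE s.company_id = 1
ORDER BY sr.reading_timestamp DESC;
```

Note that if a sensor has two readings sharing the same latest reading_timestamp, both rows are returned; the LEFT JOIN/IS NULL version behaves the same way.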