Finding the max temperature for a given day in SQL

Time: 2013-04-15 16:51:37

Tags: php mysql greatest-n-per-group

My tables:

hourly_weather                 electrical_readings
----------------               -----------------------
meter | time_read | temp       meter | time      | kwh
----------------               -----------------------
1       1316044800  55         1       1316136250  19.24
1       1316138400  56         1       1316044320  18.29
(...)                          (...)

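I want to retrieve two important values from this data:

1) I want the total kWh for a day

2) I want the highest temperature for that day

The query I'm using takes a very long time to run, but I can't think of another way to do it. With 100,000 rows in both tables, for example, it takes hours.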

SELECT * FROM (
SELECT * , SUM(kwh) AS sumkwh, 
           DATE( FROM_UNIXTIME( r.time_read ) ) AS datex, 
           UNIX_TIMESTAMP( DATE( FROM_UNIXTIME( r.time_read ) ) ) AS datey, 
           (
               SELECT MAX( temp )
               FROM hourly_weather hw
               WHERE hw.meter = 1
                 AND time_read >= datey
                 AND time_read < datey + 86400
           ) AS temp
FROM electrical_readings r
WHERE id = 1
GROUP BY datex
) as t1
WHERE t1.temp != '';

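2 answers:

Answer 0: (score: 2)

I think it would be simpler to compute the two values in separate queries and then join the resulting data sets. You can even define temporary variables and tables to make things easier: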

# Temp variables for the dates
set @t0 = cast('2013-02-01' as date);
set @t1 = cast('2013-02-02' as date);

# Temporary table 1: Sum of KWH
create temporary table temp_sum_kw
    select 
        date(from_unixtime(time_read)) as `date`, sum(kwh) as sum_kwh
    from 
        electrical_readings
    where 
        # time_read is the column name used in the question's query (the table listing calls it `time`)
        time_read >= unix_timestamp(@t0) and time_read < unix_timestamp(date_add(@t1, interval +1 day))
    group by 
        date(from_unixtime(time_read));
alter table temp_sum_kw
    add index idx_date(`date`);

# Temporary table 2: Max temp
create temporary table temp_max_temperature
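    select 
        date(from_unixtime(time_read)) as `date`, max(temp) as max_temp
    from 
        hourly_weather
    where 
        # time_read holds Unix timestamps, so compare against unix_timestamp() values rather than dates
        (time_read >= unix_timestamp(@t0) and time_read < unix_timestamp(date_add(@t1, interval +1 day)))
        and meter = 1
    group by 
        date(from_unixtime(time_read));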
alter table temp_max_temperature
    add index idx_date(`date`);

# Put it all together
select 
    m.*, t.max_temp
from
    temp_sum_kw as m
    inner join temp_max_temperature as t on m.`date` = t.`date`;

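The where condition uses a half-open range (at or after the start of @t0, and strictly before the start of the day after @t1) so that it includes everything that happened up to the last instant of @t1.

Hope this helps.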
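
To make the half-open range concrete, here is a minimal sketch in Python that computes the two Unix-timestamp bounds for the same dates as @t0 and @t1. The helper name `day_range_bounds` is made up for illustration, and the sketch assumes UTC, whereas MySQL's unix_timestamp() interprets dates in the session time zone.

```python
from datetime import date, datetime, timedelta, timezone


def day_range_bounds(first_day, last_day):
    """Return (start, end) Unix timestamps for the half-open range
    [start of first_day, start of the day after last_day), in UTC."""
    start = datetime(first_day.year, first_day.month, first_day.day,
                     tzinfo=timezone.utc)
    day_after = last_day + timedelta(days=1)
    end = datetime(day_after.year, day_after.month, day_after.day,
                   tzinfo=timezone.utc)
    return int(start.timestamp()), int(end.timestamp())


# Same dates as @t0 and @t1 in the answer.
lo, hi = day_range_bounds(date(2013, 2, 1), date(2013, 2, 2))


def in_range(ts):
    # Mirrors: time_read >= lo AND time_read < hi
    return lo <= ts < hi


last_second_of_t1 = hi - 1   # 2013-02-02 23:59:59 UTC
print(lo, hi, in_range(last_second_of_t1), in_range(hi))
```

The last second of @t1 falls inside the range, while the first second of the following day does not, so no reading is counted twice when consecutive ranges are queried.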

Answer 1: (score: 2)

SELECT DATE(FROM_UNIXTIME(r.time_read)) AS datex, 
  SUM(r.kwh) AS sumkwh, MAX(hw.temp) AS temp
FROM electrical_readings r
LEFT OUTER JOIN hourly_weather hw
  ON DATE(FROM_UNIXTIME(r.time_read)) = DATE(FROM_UNIXTIME(hw.time_read)) 
  AND hw.meter = 1
WHERE r.id = 1
GROUP BY datex
HAVING temp IS NOT NULL

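This is still going to have performance problems, because the join condition uses expressions. MySQL therefore has to read every row of both tables and evaluate the expressions before it can tell whether the join condition is satisfied.

So it would be much better if you could add an extra column to both tables for the date (without the time) and index those columns.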

ALTER TABLE electrical_readings ADD COLUMN date_read DATE, ADD KEY (date_read);
UPDATE electrical_readings SET date_read = DATE(FROM_UNIXTIME(time_read));

ALTER TABLE hourly_weather ADD COLUMN date_read DATE, ADD KEY (date_read);
UPDATE hourly_weather SET date_read = DATE(FROM_UNIXTIME(time_read));
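
As a rough illustration of what the extra column ends up holding, the sketch below runs the same steps against an in-memory SQLite database from Python, using the two hourly_weather rows from the question. This is an assumption-laden stand-in, not MySQL: SQLite's `date(x, 'unixepoch')` replaces `DATE(FROM_UNIXTIME(x))` and always works in UTC, whereas MySQL uses the session time zone, and the index name `idx_hw_date_read` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hourly_weather (meter INTEGER, time_read INTEGER, temp INTEGER);
    INSERT INTO hourly_weather VALUES (1, 1316044800, 55), (1, 1316138400, 56);

    -- Add the date-only column, fill it once, then index it.
    ALTER TABLE hourly_weather ADD COLUMN date_read TEXT;
    UPDATE hourly_weather SET date_read = date(time_read, 'unixepoch');
    CREATE INDEX idx_hw_date_read ON hourly_weather (date_read);
""")

rows = conn.execute(
    "SELECT time_read, date_read FROM hourly_weather ORDER BY time_read"
).fetchall()
print(rows)
```

Once the column is populated, the join can compare plain column values instead of evaluating a date expression for every pair of rows.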

SELECT r.date_read, 
  SUM(r.kwh) AS sumkwh, MAX(hw.temp) AS temp
FROM electrical_readings r
LEFT OUTER JOIN hourly_weather hw
  ON r.date_read = hw.date_read 
  AND hw.meter = 1
WHERE r.id = 1
GROUP BY r.date_read
HAVING temp IS NOT NULL

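In any case, it's not a good idea to add SELECT * to either of these queries, because the values returned for the non-grouped columns would be arbitrary.


Re your comment: sorry, the sum gets multiplied by the number of matching rows in hourly_weather.

We can compensate by doing the aggregation of hourly_weather in a derived-table subquery.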

SELECT r.date_read, 
  SUM(r.kwh) AS sumkwh, hw.temp
FROM electrical_readings r
LEFT OUTER JOIN (
  SELECT date_read, MAX(temp) AS temp
  FROM hourly_weather
  WHERE meter = 1
  GROUP BY date_read) AS hw
    ON r.date_read = hw.date_read 
WHERE r.id = 1
GROUP BY r.date_read
HAVING temp IS NOT NULL
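
The multiplication effect and the fix can be reproduced on a tiny data set. The sketch below uses Python's sqlite3 module with made-up rows (two electrical readings and three weather readings on the same day); it assumes the date_read columns already exist and leaves out the HAVING clause for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE electrical_readings (id INTEGER, date_read TEXT, kwh REAL);
    CREATE TABLE hourly_weather (meter INTEGER, date_read TEXT, temp INTEGER);
    -- Hypothetical sample data: 2 electrical rows, 3 weather rows, same day.
    INSERT INTO electrical_readings VALUES
        (1, '2011-09-15', 10.0), (1, '2011-09-15', 5.0);
    INSERT INTO hourly_weather VALUES
        (1, '2011-09-15', 55), (1, '2011-09-15', 56), (1, '2011-09-15', 54);
""")

# Direct join: each electrical row is repeated once per weather row,
# so SUM(kwh) is inflated by a factor of 3.
naive = conn.execute("""
    SELECT r.date_read, SUM(r.kwh), MAX(hw.temp)
    FROM electrical_readings r
    LEFT OUTER JOIN hourly_weather hw
      ON r.date_read = hw.date_read AND hw.meter = 1
    WHERE r.id = 1
    GROUP BY r.date_read
""").fetchall()

# Derived table: weather is collapsed to one row per day before the join,
# so each electrical row is counted exactly once.
fixed = conn.execute("""
    SELECT r.date_read, SUM(r.kwh), hw.temp
    FROM electrical_readings r
    LEFT OUTER JOIN (
      SELECT date_read, MAX(temp) AS temp
      FROM hourly_weather
      WHERE meter = 1
      GROUP BY date_read) AS hw
      ON r.date_read = hw.date_read
    WHERE r.id = 1
    GROUP BY r.date_read
""").fetchall()

print(naive)
print(fixed)
```

The true daily total here is 15 kWh; the direct join reports three times that, while the derived-table version reports the correct sum with the same maximum temperature.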

It would also be best to create an index on hourly_weather:

ALTER TABLE hourly_weather ADD KEY (date_read, meter, temp);