My tables:
hourly_weather                 electrical_readings
--------------------------     --------------------------
meter | time_read  | temp      meter | time       | kwh
--------------------------     --------------------------
1     | 1316044800 | 55        1     | 1316136250 | 19.24
1     | 1316138400 | 56        1     | 1316044320 | 18.29
(...)                          (...)
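I want to retrieve two important values from this data:
1) the total kWh for a day
2) the highest temperature for that day
The query I'm using takes a very long time to run, but I can't think of another way to do it. For example, with 100,000 rows in the two tables it takes several hours.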
SELECT * FROM (
SELECT * , SUM(kwh) AS sumkwh,
DATE( FROM_UNIXTIME( r.time_read ) ) AS datex,
UNIX_TIMESTAMP( DATE( FROM_UNIXTIME( r.time_read ) ) ) AS datey,
(
SELECT MAX( temp )
FROM hourly_weather hw
WHERE hw.meter = 1
AND time_read >= datey
AND time_read < datey + 86400
) AS temp
FROM electrical_readings r
WHERE id = 1
GROUP BY datex
) as t1
WHERE t1.temp != '';
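Answer 0 (score: 2)
I think it would be simpler to compute the two values in separate queries and then join the resulting data sets. You can even define temporary variables and tables to make this easier: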
# Temp variables for the dates
set @t0 = cast('2013-02-01' as date);
set @t1 = cast('2013-02-02' as date);
# Temporary table 1: Sum of KWH
create temporary table temp_sum_kw
select
# time_read is the column name used in the question's query
date(from_unixtime(time_read)) as `date`, sum(kwh) as sum_kwh
from
electrical_readings
where
time_read >= unix_timestamp(@t0) and time_read < unix_timestamp(date_add(@t1, interval 1 day))
group by
date(from_unixtime(time_read));
alter table temp_sum_kw
add index idx_date(`date`);
# Temporary table 2: Max temp
create temporary table temp_max_temperature
select
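date(from_unixtime(time_read)) as `date`, max(temp) as max_temp
from
hourly_weather
where
# time_read holds Unix timestamps, so the date bounds are converted with unix_timestamp()
(time_read >= unix_timestamp(@t0) and time_read < unix_timestamp(date_add(@t1, interval 1 day)))
and meter = 1
group by
date(from_unixtime(time_read));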
alter table temp_max_temperature
add index idx_date(`date`);
# Put it all together
select
m.*, t.max_temp
from
temp_sum_kw as m
inner join temp_max_temperature as t on m.`date` = t.`date`;
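The reason for writing the where condition as a half-open range, from @t0 up to but not including date_add(@t1, interval +1 day), is to include everything that happened up to the last moment of @t1.
Hope this helps.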
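The half-open range described above can be sketched outside the database. This is a minimal illustration in Python, not part of the original answer: the helper name day_range_bounds is hypothetical, and it assumes UTC, whereas MySQL's unix_timestamp() interprets dates in the session time zone.

```python
from datetime import date, datetime, time, timedelta, timezone


def day_range_bounds(first_day: date, last_day: date) -> tuple[int, int]:
    """Return (start, end) Unix timestamps for the half-open range
    [first_day 00:00, day after last_day 00:00). Assumes UTC."""
    start = datetime.combine(first_day, time.min, tzinfo=timezone.utc)
    end = datetime.combine(last_day + timedelta(days=1), time.min, tzinfo=timezone.utc)
    return int(start.timestamp()), int(end.timestamp())


# Same dates as @t0 and @t1 in the answer.
start, end = day_range_bounds(date(2013, 2, 1), date(2013, 2, 2))

last_second_of_t1 = end - 1   # 2013-02-02 23:59:59 UTC
midnight_after_t1 = end       # 2013-02-03 00:00:00 UTC

# A reading in the last second of @t1 falls inside the range...
included = start <= last_second_of_t1 < end
# ...while a reading at midnight of the following day does not.
excluded = not (start <= midnight_after_t1 < end)
print(included, excluded)
```

Comparing with "< start of the next day" avoids having to guess the last representable instant of @t1, which is why the range is left open at the upper end.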
Answer 1 (score: 2)
SELECT DATE(FROM_UNIXTIME(r.time_read)) AS datex,
SUM(r.kwh) AS sumkwh, MAX(hw.temp) AS temp
FROM electrical_readings r
LEFT OUTER JOIN hourly_weather hw
ON DATE(FROM_UNIXTIME(r.time_read)) = DATE(FROM_UNIXTIME(hw.time_read))
AND hw.meter = 1
WHERE r.id = 1
GROUP BY datex
HAVING temp IS NOT NULL
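This still has a performance problem, because the join condition uses expressions. MySQL therefore has to read every row of both tables and evaluate the expressions before it can tell whether the join condition is satisfied; it cannot use an index for the join.
So it would be much better if you could add an extra column for the date (without the time) to both tables and index those columns.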
ALTER TABLE electrical_readings ADD COLUMN date_read DATE, ADD KEY (date_read);
UPDATE electrical_readings SET date_read = DATE(FROM_UNIXTIME(time_read));
ALTER TABLE hourly_weather ADD COLUMN date_read DATE, ADD KEY (date_read);
UPDATE hourly_weather SET date_read = DATE(FROM_UNIXTIME(time_read));
SELECT r.date_read,
SUM(r.kwh) AS sumkwh, MAX(hw.temp) AS temp
FROM electrical_readings r
LEFT OUTER JOIN hourly_weather hw
ON r.date_read = hw.date_read
AND hw.meter = 1
WHERE r.id = 1
GROUP BY r.date_read
HAVING temp IS NOT NULL
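In any case, adding SELECT * to any of these queries is not a good idea, because the results would be arbitrary: columns that are neither aggregated nor listed in GROUP BY are taken from an arbitrary row of each group.
Re your comment: sorry, the sum is multiplied by the number of matching rows in hourly_weather, because each electrical reading is joined to every weather row for the same day.
We can compensate by doing the aggregation of hourly_weather in a derived-table subquery, so that each date matches at most one weather row.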
SELECT r.date_read,
SUM(r.kwh) AS sumkwh, hw.temp
FROM electrical_readings r
LEFT OUTER JOIN (
SELECT date_read, MAX(temp) AS temp
FROM hourly_weather
WHERE meter = 1
GROUP BY date_read) AS hw
ON r.date_read = hw.date_read
WHERE r.id = 1
GROUP BY r.date_read
HAVING temp IS NOT NULL
It would also be best to create an index on hourly_weather:
ALTER TABLE hourly_weather ADD KEY (date_read, meter, temp);
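To make the row multiplication mentioned in this answer concrete, here is a small self-contained sketch. It is an illustration under stated assumptions rather than the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, stores date_read as text, and uses made-up sample rows (two electrical readings and three weather rows on the same day).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE electrical_readings (id INTEGER, date_read TEXT, kwh REAL);
    CREATE TABLE hourly_weather (meter INTEGER, date_read TEXT, temp INTEGER);
    -- Made-up sample data: 2 readings and 3 weather rows for the same day.
    INSERT INTO electrical_readings VALUES
        (1, '2011-09-15', 10.0), (1, '2011-09-15', 5.0);
    INSERT INTO hourly_weather VALUES
        (1, '2011-09-15', 55), (1, '2011-09-15', 56), (1, '2011-09-15', 54);
""")

# Direct join: every reading pairs with every weather row of that day,
# so each kwh value is counted three times.
naive = conn.execute("""
    SELECT r.date_read, SUM(r.kwh), MAX(hw.temp)
    FROM electrical_readings r
    LEFT OUTER JOIN hourly_weather hw
      ON r.date_read = hw.date_read AND hw.meter = 1
    WHERE r.id = 1
    GROUP BY r.date_read
""").fetchone()

# Derived table: weather is aggregated to one row per day before the join,
# so each reading is counted exactly once.
fixed = conn.execute("""
    SELECT r.date_read, SUM(r.kwh), hw.temp
    FROM electrical_readings r
    LEFT OUTER JOIN (
        SELECT date_read, MAX(temp) AS temp
        FROM hourly_weather
        WHERE meter = 1
        GROUP BY date_read) AS hw
      ON r.date_read = hw.date_read
    WHERE r.id = 1
    GROUP BY r.date_read
""").fetchone()

print(naive)  # ('2011-09-15', 45.0, 56)
print(fixed)  # ('2011-09-15', 15.0, 56)
```

The true total is 15.0 kWh; the direct join reports 45.0 because 2 readings times 3 weather rows yields 6 joined rows. MAX is unaffected by the duplication, which is why only the sum looked wrong.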