Using volkszaehler.org, I need to retrieve data from a table with more than a million rows. This is what the ORM created:
CREATE TABLE `data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`channel_id` int(11) DEFAULT NULL,
`timestamp` bigint(20) NOT NULL,
`value` double NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `ts_uniq` (`channel_id`,`timestamp`),
KEY `IDX_ADF3F36372F5A1AA` (`channel_id`)
)
Selecting grouped data is slow, especially when running on low-performance platforms such as the Raspberry Pi:
SELECT MAX(timestamp) AS timestamp, SUM(value) AS value, COUNT(timestamp) AS count
FROM data WHERE channel_id = 4 AND timestamp >= 1356994800000 AND timestamp <= 1375009341000
GROUP BY YEAR(FROM_UNIXTIME(timestamp/1000)), DAYOFYEAR(FROM_UNIXTIME(timestamp/1000));
EXPLAIN output:
select_type: SIMPLE
table: data
type: ref
possible_keys: ts_uniq,IDX_ADF3F36372F5A1AA
key: ts_uniq
key_len: 5
ref: const
rows: 2066
Extra: Using where; Using temporary; Using filesort
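The query has to go through 50k records and takes 1.5 seconds on a Core i5 and 6 seconds on the RasPi.
Apart from reducing the amount of data, is there anything that can improve performance?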
Answer 0 (score: 1)
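Increasing the amount of data, not reducing it, is what you need. You have two functions in the GROUP BY clause. If you compute YEAR(FROM_UNIXTIME(timestamp/1000)) and DAYOFYEAR(FROM_UNIXTIME(timestamp/1000)) beforehand in a trigger and store the values in additional columns, the SELECT statement will be faster.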
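A minimal sketch of that idea in MySQL. The column names ts_year and ts_doy and the trigger name data_set_year_doy are hypothetical, chosen only for illustration; existing rows need a one-time backfill, and rows whose timestamp is later updated are not covered here.

```sql
-- Hypothetical extra columns holding the precomputed grouping values.
ALTER TABLE data
  ADD COLUMN ts_year SMALLINT NULL,
  ADD COLUMN ts_doy  SMALLINT NULL;

-- One-time backfill for rows that already exist.
UPDATE data
SET ts_year = YEAR(FROM_UNIXTIME(timestamp/1000)),
    ts_doy  = DAYOFYEAR(FROM_UNIXTIME(timestamp/1000));

-- Fill the columns for every newly inserted row.
CREATE TRIGGER data_set_year_doy BEFORE INSERT ON data
FOR EACH ROW
  SET NEW.ts_year = YEAR(FROM_UNIXTIME(NEW.timestamp/1000)),
      NEW.ts_doy  = DAYOFYEAR(FROM_UNIXTIME(NEW.timestamp/1000));

-- The grouping no longer evaluates any function per row.
SELECT MAX(timestamp) AS timestamp, SUM(value) AS value, COUNT(timestamp) AS count
FROM data
WHERE channel_id = 4 AND timestamp >= 1356994800000 AND timestamp <= 1375009341000
GROUP BY ts_year, ts_doy;
```

Note that FROM_UNIXTIME converts using the session time zone, so the stored values reflect the time zone in effect when the trigger or backfill runs.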
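Beyond that, you can simply truncate timestamp to the day by dividing it by 1000 * 3600 * 24 = 86400000 and group by that single value. I can't see the point of grouping by year and day of year when you can just group by date: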
SELECT
MAX(timestamp) AS timestamp,
SUM(value) AS value,
COUNT(timestamp) AS count
FROM data WHERE
channel_id = 4 AND
timestamp >= 1356994800000 AND
timestamp <= 1375009341000
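-- DIV is integer division; plain / returns a decimal in MySQL and would not bucket rows by day
GROUP BY timestamp DIV 86400000;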
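Personally, I would then add a date column, index it, and update it in a trigger, so that all arithmetic expressions can be removed from the GROUP BY. In that case, the index will be used.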
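A rough sketch of that last suggestion, assuming MySQL. The column name ts_date, the index name idx_channel_date, and the trigger name data_set_date are hypothetical. The date bounds in the query are illustrative and only approximate the original millisecond range; the exact equivalents depend on the session time zone, and whole days are included at both ends.

```sql
-- Hypothetical date column that stores the day each reading belongs to.
ALTER TABLE data ADD COLUMN ts_date DATE NULL;

-- One-time backfill for existing rows.
UPDATE data SET ts_date = DATE(FROM_UNIXTIME(timestamp/1000));

-- Composite index so filtering by channel and grouping by day can use one index.
CREATE INDEX idx_channel_date ON data (channel_id, ts_date);

-- Keep the column populated for new rows.
CREATE TRIGGER data_set_date BEFORE INSERT ON data
FOR EACH ROW
  SET NEW.ts_date = DATE(FROM_UNIXTIME(NEW.timestamp/1000));

-- GROUP BY now refers to a plain indexed column with no arithmetic.
SELECT MAX(timestamp) AS timestamp, SUM(value) AS value, COUNT(timestamp) AS count
FROM data
WHERE channel_id = 4
  AND ts_date >= '2013-01-01'   -- illustrative bound, assumed to match the original range
  AND ts_date <= '2013-07-28'   -- illustrative bound, assumed to match the original range
GROUP BY ts_date;
```

Running EXPLAIN on the final query is the way to confirm whether the optimizer actually picks the new index and drops the "Using temporary; Using filesort" steps on a given installation.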