MySQL subquery performance

Date: 2011-07-26 04:56:53

Tags: mysql query-optimization subquery

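My SQL query (which uses a subquery) takes a very long time to run, close to 24 hours. Is using a subquery bad for performance?

My table looks like this: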

mysql> show create table eventnew;
CREATE TABLE `eventnew` (
  `id` int(50) NOT NULL AUTO_INCREMENT,
  `date` datetime DEFAULT NULL,
  `src_ip` int(10) unsigned DEFAULT NULL,
  `src_port` int(10) unsigned DEFAULT NULL,
  `dst_ip` int(10) unsigned DEFAULT NULL,
  `dst_port` int(10) unsigned DEFAULT NULL,
  `repo_ip` varchar(50) DEFAULT NULL,
  `link` varchar(50) DEFAULT NULL,
  `binary_hash` varchar(50) DEFAULT NULL,
  `sensor_id` varchar(50) DEFAULT NULL,
  `repox_ip` int(10) unsigned DEFAULT NULL,
  `flags` varchar(50) DEFAULT NULL,
  `shellcode` varchar(1000) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `date` (`date`),
  KEY `sensor_id` (`sensor_id`),
  KEY `src_ip` (`src_ip`)
) ENGINE=MyISAM AUTO_INCREMENT=883278 DEFAULT CHARSET=latin1

My SQL is as follows:

SELECT COUNT(DISTINCT binary_hash) AS cnt
FROM eventnew
WHERE date >= '2010-10-16'
  AND date <  '2010-10-17'
  AND binary_hash NOT IN (
        SELECT DISTINCT binary_hash
        FROM eventnew
        WHERE date < '2010-10-16'
          AND binary_hash IS NOT NULL
      )

Here is the output of running EXPLAIN on it:

+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| id | select_type        | table    | type  | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
|  1 | PRIMARY            | eventnew | range | date          | date | 9       | NULL |  14296 | Using where |
|  2 | DEPENDENT SUBQUERY | eventnew | range | date          | date | 9       | NULL | 384974 | Using where |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+

3 answers:

Answer 0 (score: 2)

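Using a subquery like this will definitely affect your performance. For example, suppose table T1 has n records and T2 has m records. When you join T1 and T2 without a usable index, the server may have to examine up to n * m row combinations and then filter them according to your condition. The same applies to the IN keyword. Your EXPLAIN output marks the subquery as a DEPENDENT SUBQUERY, which means it is re-evaluated for each of the roughly 14,296 outer rows, scanning up to about 384,974 rows each time; that is on the order of 5.5 billion row checks. Any additional constraint inside the subquery reduces efficiency further. In practice, however, subqueries cannot always be avoided.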

Answer 1 (score: 0)

I suggest you use NOT EXISTS instead of NOT IN.
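
As a rough sketch of what that rewrite could look like for the query in the question (the aliases `e` and `older` are just illustrative names, and whether it actually runs faster depends on the MySQL version and the available indexes):

```sql
-- Count hashes first seen on 2010-10-16, i.e. hashes with no earlier row.
-- Assumes the eventnew table from the question; aliases are illustrative.
SELECT COUNT(DISTINCT e.binary_hash) AS cnt
FROM eventnew AS e
WHERE e.date >= '2010-10-16'
  AND e.date <  '2010-10-17'
  AND NOT EXISTS (
        -- Correlated lookup: is there any earlier row with the same hash?
        SELECT 1
        FROM eventnew AS older
        WHERE older.binary_hash = e.binary_hash
          AND older.date < '2010-10-16'
      );
```

Rows whose binary_hash is NULL never match the equality in the inner query, so they pass the NOT EXISTS test, but COUNT(DISTINCT ...) ignores NULL values, so the count should be the same as with the NOT IN version.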

Answer 2 (score: 0)

Try this:

SELECT COUNT(DISTINCT a.binary_hash) AS cnt
FROM eventnew a
LEFT JOIN eventnew b
       ON a.binary_hash = b.binary_hash
      AND b.binary_hash IS NOT NULL
      AND b.date < '2010-10-16'
WHERE a.date >= '2010-10-16'
  AND a.date <  '2010-10-17'
  AND b.date IS NULL
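
Note that the table in the question has indexes on date, sensor_id and src_ip, but none on binary_hash, so the join condition on binary_hash has nothing to look rows up by. One thing that might help, as an untested suggestion, is a composite index covering both columns the join uses. The index name `idx_hash_date` below is made up for illustration:

```sql
-- Hypothetical composite index so the self-join can find earlier rows
-- by hash and date without scanning the whole date range each time.
ALTER TABLE eventnew
  ADD INDEX idx_hash_date (binary_hash, date);

-- Re-run EXPLAIN afterwards to check whether the optimizer picks it up.
EXPLAIN
SELECT COUNT(DISTINCT a.binary_hash) AS cnt
FROM eventnew a
LEFT JOIN eventnew b
       ON a.binary_hash = b.binary_hash
      AND b.date < '2010-10-16'
WHERE a.date >= '2010-10-16'
  AND a.date <  '2010-10-17'
  AND b.date IS NULL;
```

If the index is used, the row for table b in the EXPLAIN output would be expected to show it in the key column rather than the date index.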