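My SQL query (with a subquery) takes a very long time to run (close to 24 hours). Is using a subquery bad for performance?
My table is defined as follows: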
mysql> show create table eventnew;
CREATE TABLE `eventnew` (
`id` int(50) NOT NULL AUTO_INCREMENT,
`date` datetime DEFAULT NULL,
`src_ip` int(10) unsigned DEFAULT NULL,
`src_port` int(10) unsigned DEFAULT NULL,
`dst_ip` int(10) unsigned DEFAULT NULL,
`dst_port` int(10) unsigned DEFAULT NULL,
`repo_ip` varchar(50) DEFAULT NULL,
`link` varchar(50) DEFAULT NULL,
`binary_hash` varchar(50) DEFAULT NULL,
`sensor_id` varchar(50) DEFAULT NULL,
`repox_ip` int(10) unsigned DEFAULT NULL,
`flags` varchar(50) DEFAULT NULL,
`shellcode` varchar(1000) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `date` (`date`),
KEY `sensor_id` (`sensor_id`),
KEY `src_ip` (`src_ip`)
) ENGINE=MyISAM AUTO_INCREMENT=883278 DEFAULT CHARSET=latin1
My SQL query is as follows:
SELECT COUNT( DISTINCT binary_hash ) AS cnt
FROM eventnew
WHERE DATE >= '2010-10-16'
AND DATE < '2010-10-17'
AND binary_hash NOT
IN (
SELECT DISTINCT binary_hash
FROM eventnew
WHERE DATE < '2010-10-16'
AND binary_hash IS NOT NULL
)
Here is the output of running EXPLAIN on it:
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| 1 | PRIMARY | eventnew | range | date | date | 9 | NULL | 14296 | Using where |
| 2 | DEPENDENT SUBQUERY | eventnew | range | date | date | 9 | NULL | 384974 | Using where |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
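Answer 0 (score: 2)
Using a subquery will definitely affect your performance. For example, suppose table T1 has n records and T2 has m records. When you join T1 and T2, the server may have to consider n * m row combinations and then filter them by your conditions. The same applies to the IN keyword. If the subquery contains further constraints, efficiency drops even more. In practice, however, subqueries are sometimes unavoidable.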
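To put the n * m estimate in terms of the question's own numbers, the EXPLAIN output reports roughly 14296 rows for the outer query and 384974 rows for the DEPENDENT SUBQUERY, which is re-evaluated for each outer row. As a rough worst-case sketch (these are optimizer estimates, and the server may stop scanning early once a match is found):

```sql
-- Worst-case row examinations: outer rows * subquery rows,
-- using the estimates from the EXPLAIN output in the question.
SELECT 14296 * 384974 AS worst_case_row_checks;
-- 5503588304
```

That is about 5.5 billion row checks, which is consistent with a query that runs for many hours.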
Answer 1 (score: 0)
I suggest using NOT EXISTS instead of NOT IN.
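A minimal sketch of what that rewrite could look like for the query in the question. The aliases e and prev are illustrative, not from the original post:

```sql
-- Count hashes seen on 2010-10-16 that never appeared before that date.
SELECT COUNT(DISTINCT e.binary_hash) AS cnt
FROM eventnew AS e
WHERE e.`date` >= '2010-10-16'
  AND e.`date` <  '2010-10-17'
  AND NOT EXISTS (
        -- Correlated lookup: any earlier row with the same hash?
        SELECT 1
        FROM eventnew AS prev
        WHERE prev.binary_hash = e.binary_hash
          AND prev.`date` < '2010-10-16'
      );
```

The IS NOT NULL filter from the original subquery is not needed here: a NULL binary_hash never satisfies the equality, and COUNT(DISTINCT ...) ignores NULL values anyway. Whether this is actually faster depends on the MySQL version and the available indexes, so it is worth comparing the EXPLAIN output of both forms.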
Answer 2 (score: 0)
Try this:
SELECT COUNT( DISTINCT a.binary_hash ) AS cnt
FROM eventnew a left join eventnew b on (a.binary_hash=b.binary_hash AND b.binary_hash IS NOT NULL AND b.DATE < '2010-10-16')
WHERE a.DATE >= '2010-10-16'
AND a.DATE < '2010-10-17'
and b.date is null
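The join above matches rows on binary_hash, and the table definition in the question has no index on that column, so each lookup may still scan many rows. A hedged sketch, assuming it is acceptable to add an index to this table (the index name idx_binary_hash_date is hypothetical):

```sql
-- Hypothetical index: supports the equality match on binary_hash
-- and the range filter on date in the joined copy of the table.
ALTER TABLE eventnew
  ADD INDEX idx_binary_hash_date (binary_hash, `date`);

-- Re-check the plan afterwards to see whether the new index is chosen.
EXPLAIN
SELECT COUNT(DISTINCT a.binary_hash) AS cnt
FROM eventnew a
LEFT JOIN eventnew b
       ON a.binary_hash = b.binary_hash
      AND b.`date` < '2010-10-16'
WHERE a.`date` >= '2010-10-16'
  AND a.`date` <  '2010-10-17'
  AND b.`date` IS NULL;
```

Whether the optimizer picks this index depends on the MySQL version and the data distribution, so treat the EXPLAIN output as the deciding evidence rather than this sketch.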