I'm having a really bad time with a query on MySQL 5.1. I've simplified the 2 tables I'm joining:
CREATE TABLE `jobs` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`title` VARCHAR( 255 ) NOT NULL
) ENGINE = MYISAM ;
and
CREATE TABLE `jobsCategories` (
`jobID` int(11) NOT NULL,
`industryID` int(11) NOT NULL,
KEY `jobID` (`jobID`),
KEY `industryID` (`industryID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
The query is pretty simple:
SELECT count(*) as nb,industryID
FROM jobs J
INNER JOIN jobsCategories C ON C.jobID=J.id
GROUP BY industryID
ORDER BY nb DESC;
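I have about 150,000 records in the jobs table and 350,000 records in the jobsCategories table, and I have 30 industries.
The query takes about 50 seconds to execute!
Do you have any idea why it takes so long? How could I optimize the structure of this database? Profiling the query shows that 99% of the execution time is spent copying to the tmp table.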
EXPLAIN <query> gives me :
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: J
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 178950
Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: C
type: ref
possible_keys: jobID
key: jobID
key_len: 8
ref: J.id
rows: 1
Extra: Using where
2 rows in set (0.00 sec)
About memory:
free -m:
             total       used       free     shared    buffers     cached
Mem:          2011       1516        494          0          8       1075
-/+ buffers/cache:        433       1578
Swap:         5898        126       5772
Using FORCE INDEX as suggested below:
select count(*) as nb, industryID
from
jobs J
inner join jobsCategories C force index (industryID) on (C.jobID = J.id )
group by industryID
order by nb DESC;
SHOW PROFILE;
gives me:
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000095 |
| Opening tables | 0.000014 |
| System lock | 0.000008 |
| Table lock | 0.000007 |
| init | 0.000032 |
| optimizing | 0.000011 |
| statistics | 0.000032 |
| preparing | 0.000016 |
| Creating tmp table | 0.000031 |
| executing | 0.000003 |
| Copying to tmp table | 3.301305 |
| Sorting result | 0.000028 |
| Sending data | 0.000024 |
| end | 0.000003 |
| removing tmp table | 0.000009 |
| end | 0.000004 |
| query end | 0.000003 |
| freeing items | 0.000029 |
| logging slow query | 0.000003 |
| cleaning up | 0.000003 |
+----------------------+----------+
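I guess my RAM (2 GB) isn't large enough. How can I make sure that's the case?
Answer 0 (score: 4)
First, I think you don't need to join the jobs table to get the same result (unless you have some junk data in the jobsCategories table):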
select count(*) as nb, industryID
from jobsCategories
group by industryID
order by nb DESC;
Otherwise, you could try forcing the index on industryID:
select count(*) as nb, industryID
from
jobs J
inner join jobsCategories C force index (industryID) on (C.jobID = J.id )
group by industryID
order by nb DESC;
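As a hedged sketch of how to check for the "junk data" case mentioned above: the anti-join below counts jobsCategories rows whose jobID has no matching row in jobs. The alias orphans is just an illustrative name. Because jobs.id is the primary key, each jobsCategories row matches at most one job, so if this returns 0 the query without the join should give the same counts as the original.

```sql
-- Count jobsCategories rows that reference a job that does not exist.
-- "orphans" is an illustrative alias, not something from the question.
SELECT COUNT(*) AS orphans
FROM jobsCategories C
LEFT JOIN jobs J ON J.id = C.jobID
WHERE J.id IS NULL;
```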
Answer 1 (score: 0)
Change your tables to InnoDB =) InnoDB manages large tables very well, and COUNT(*) may be faster:
http://www.mysqlperformanceblog.com/2009/01/12/should-you-move-from-myisam-to-innodb/
Good luck
EDIT
After some testing, it seems MyISAM is faster than InnoDB for COUNT(*) when there is no WHERE clause:
http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/
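Anyway, I tested your exact query using MyISAM tables that simulate the ones you have (150k jobs and 300k jobsCategories), and it took 1.5 seconds, so maybe your problem is elsewhere. That's all I can tell you =P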
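One possible way to narrow down where the time goes, offered as a sketch rather than a diagnosis: check whether the implicit temporary table used for the GROUP BY is being written to disk, and how much memory MyISAM has for caching indexes. The statements below only read standard MySQL status and system variables; which values are appropriate depends on the server.

```sql
-- Note the session counters before running the slow query.
SHOW SESSION STATUS LIKE 'Created_tmp%';

SELECT count(*) as nb, industryID
FROM jobs J
INNER JOIN jobsCategories C ON C.jobID = J.id
GROUP BY industryID
ORDER BY nb DESC;

-- Compare with the values above. If Created_tmp_disk_tables grew,
-- a temporary table was written to disk. SHOW STATUS may itself
-- add to Created_tmp_tables, so look at the difference, not the total.
SHOW SESSION STATUS LIKE 'Created_tmp%';

-- Size limits for in-memory temporary tables (the smaller one applies).
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Memory available for caching MyISAM index blocks.
SHOW VARIABLES LIKE 'key_buffer_size';
```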
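Answer 2 (score: 0)
Hopefully I'm not misreading this, but from what I see you don't need any join. Since you're counting how many jobs are in each industry, and all of that is in your jobsCategories table, why join to the actual jobs table for the job title when it isn't even returned?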
select industryID,
count(*) as JobsPerIndustry
from jobsCategories
group by industryID;
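EDIT per comments/feedback...
That would definitely make a difference: once you add criteria tied to the jobs, the join is needed. Make sure your jobs table has an index on the columns you want to filter by (here, countryID), then run a query similar to your original one.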
SELECT
count(*) as nb,
industryID
FROM jobs J
JOIN jobsCategories C
ON J.ID = C.jobID
WHERE
J.countryID=1234
GROUP BY
industryID
ORDER BY
nb DESC;
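A minimal sketch of the indexing step described above. It assumes the full jobs table has a countryID column (it is not part of the simplified schema in the question), and the index name idx_countryID is made up for illustration.

```sql
-- Add an index so the WHERE J.countryID = ... filter can use it.
-- idx_countryID is a hypothetical name; countryID is assumed to exist.
ALTER TABLE jobs ADD INDEX idx_countryID (countryID);

-- Re-check the plan. Ideally table J is now accessed through the new
-- index (type "ref") instead of a full scan of the primary key.
EXPLAIN
SELECT count(*) as nb, industryID
FROM jobs J
JOIN jobsCategories C ON J.id = C.jobID
WHERE J.countryID = 1234
GROUP BY industryID
ORDER BY nb DESC;
```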