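I have been struggling with the following problem (and others like it), and I feel I am missing something, or using the wrong type of database.
The query gets the total number of new films, and the total number of films that stopped showing (closed), per year over the past 10 years, for the UK as a whole compared with a specific town. It is run for many years and for many towns and counties.
Other queries do similar things, sometimes with a UNION ALL added at the end of the query to get the record year for openings or closures.
There are also queries that run against monthly and quarterly data rather than yearly data, and some that only compare historical openings/closures for a specific quarter (e.g. Q3) or month (e.g. March).
Here is the query comparing the UK with London for 2012: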
SELECT inc.opening_year as year, inc.number_of_films as opens,
diss.number_of_films as closures, inc.uk_films as uk_opens,
diss.uk_films as uk_closures
FROM
(SELECT film_db.opening_year, uk.number_of_films as uk_films,
COUNT(film_db.id_film_db) as number_of_films
FROM film_db
JOIN postcodes ON id_postcodes = opening_postcode_id
JOIN towns ON id_towns = town_id AND town = 'London'
JOIN (SELECT opening_year, COUNT(film_db.id_film_db) as number_of_films
FROM film_db
WHERE opening_year <= 2012 AND opening_year >= (2012 - 10)
GROUP BY opening_year
) uk ON uk.opening_year = film_db.opening_year
WHERE film_db.opening_year <= 2012 AND film_db.opening_year >= (2012 - 10)
GROUP BY film_db.opening_year
ORDER BY film_db.opening_year DESC
) inc
JOIN
(SELECT film_db.closing_year, uk.number_of_films as uk_films,
COUNT(film_db.id_film_db) as number_of_films
FROM film_db
JOIN postcodes ON id_postcodes = postcode_id
JOIN towns ON id_towns = town_id AND town = 'London'
JOIN (SELECT closing_year, COUNT(film_db.id_film_db) as number_of_films
FROM film_db
WHERE film_db.closing_year <= 2012 AND film_db.closing_year >= (2012 - 10)
GROUP BY film_db.closing_year
) uk ON uk.closing_year = film_db.closing_year
WHERE film_db.closing_year <= 2012 AND film_db.closing_year >= (2012 - 10)
GROUP BY film_db.closing_year
ORDER BY film_db.closing_year DESC
) diss ON diss.closing_year = inc.opening_year
The SHOW CREATE TABLE output for each table is as follows:
film_db:
CREATE TABLE `film_db` (
`id_film_db` int(11) NOT NULL AUTO_INCREMENT,
`film_name` varchar(255) DEFAULT NULL,
`category` varchar(100) DEFAULT NULL,
`status` varchar(50) DEFAULT NULL,
`opening_date` date DEFAULT NULL,
`opening_year` int(4) DEFAULT NULL,
`opening_month` int(2) DEFAULT NULL,
`opening_quarter` int(1) DEFAULT NULL,
`closing_date` date DEFAULT NULL,
`closing_year` int(4) DEFAULT NULL,
`closing_month` int(2) DEFAULT NULL,
`closing_quarter` int(1) DEFAULT NULL,
`datetime` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`postcode_id` int(4) NOT NULL DEFAULT '0',
`opening_postcode_id` int(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`id_film_db`),
KEY `opening_date` (`opening_date`),
KEY `status` (`status`),
KEY `postcode_id` (`postcode_id`),
KEY `type` (`category`),
KEY `opening_year` (`opening_year`),
KEY `opening_month` (`opening_month`,`opening_year`) USING BTREE,
KEY `opening_quarter` (`opening_quarter`,`opening_year`) USING BTREE,
KEY `closing_year` (`closing_year`),
KEY `closing_month` (`closing_year`,`closing_month`),
KEY `closing_quarter` (`closing_year`,`closing_quarter`),
KEY `closing_date` (`closing_date`),
KEY `opening_closing_date` (`opening_date`,`closing_date`),
KEY `opening_postcode` (`opening_postcode_id`),
FULLTEXT KEY `film_name` (`film_name`)
) ENGINE=MyISAM AUTO_INCREMENT=10649173 DEFAULT CHARSET=utf8
postcodes:
CREATE TABLE `postcodes` (
`id_postcodes` int(4) NOT NULL AUTO_INCREMENT,
`postcode` varchar(9) NOT NULL,
`town_id` int(4) NOT NULL,
`lat` float NOT NULL,
`lng` float NOT NULL,
PRIMARY KEY (`id_postcodes`),
UNIQUE KEY `postcode` (`postcode`) USING BTREE,
KEY `town` (`town_id`)
) ENGINE=MyISAM AUTO_INCREMENT=5705 DEFAULT CHARSET=latin1
towns:
CREATE TABLE `towns` (
`id_towns` int(4) NOT NULL AUTO_INCREMENT,
`town` varchar(150) NOT NULL,
`county_id` int(3) NOT NULL,
PRIMARY KEY (`id_towns`),
KEY `county` (`county_id`)
) ENGINE=MyISAM AUTO_INCREMENT=1606 DEFAULT CHARSET=latin1
This is the EXPLAIN EXTENDED output:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 11 | 100 |
1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 11 | 100 | Using where; Using join buffer
4 | DERIVED | <derived5> | ALL | NULL | NULL | NULL | NULL | 11 | 100 | Using where; Using temporary; Using filesort
4 | DERIVED | film_db | ref | postcode_id,closing_year,closing_month,closing_quarter | closing_year | 5 | uk.closing_year | 2 | 100 | Using where
4 | DERIVED | postcodes | eq_ref | PRIMARY,town | PRIMARY | 4 | film_db.postcode_id | 1 | 100 |
4 | DERIVED | towns | eq_ref | PRIMARY | PRIMARY | 4 | postcodes.town_id | 1 | 100 | Using where
5 | DERIVED | film_db | ALL | closing_year,closing_month,closing_quarter | NULL | NULL | NULL | 9895680 | 47.66 | Using where; Using temporary; Using filesort
2 | DERIVED | <derived3> | ALL | NULL | NULL | NULL | NULL | 11 | 100 | Using where; Using temporary; Using filesort
2 | DERIVED | film_db | ref | opening_year,opening_postcode | opening_year | 5 | uk.opening_year | 3 | 100 | Using where
2 | DERIVED | postcodes | eq_ref | PRIMARY,town | PRIMARY | 4 | film_db.opening_postcode_id | 1 | 100 |
2 | DERIVED | towns | eq_ref | PRIMARY | PRIMARY | 4 | postcodes.town_id | 1 | 100 | Using where
3 | DERIVED | film_db | ALL | opening_year | NULL | NULL | NULL | 9895680 | 54.53 | Using where; Using temporary; Using filesort
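As you can see from the two film_db rows with type ALL, MySQL does not think that filtering on the film_db table would make any difference to performance in the UK-wide subqueries, so it does not use any key there.
So:
Can I improve this query to make better use of the indexes?
Can I improve the indexes so that the query runs faster?
Should I be using another type of database (instead of MySQL) for this kind of query, where I am mostly interested in counting entries with complex conditions and joins?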
Answer 0 (score: 1)
This is the first thing I would try:
-- UK-wide opening counts per year, computed once instead of in a subselect
CREATE TABLE opens
SELECT opening_year, COUNT(film_db.id_film_db) AS number_of_films
FROM film_db
WHERE opening_year <= 2012 AND opening_year >= (2012 - 10)
GROUP BY opening_year;

-- UK-wide closure counts per year, computed once instead of in a subselect
CREATE TABLE closures
SELECT closing_year, COUNT(film_db.id_film_db) AS number_of_films
FROM film_db
WHERE film_db.closing_year <= 2012 AND film_db.closing_year >= (2012 - 10)
GROUP BY film_db.closing_year;
I would use these two tables instead of the subselects you are using now.
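As a rough sketch of what that could look like, the original query can join the precomputed opens and closures tables where it previously joined the UK-wide subselects. The table aliases (f, p, t, o, c) are introduced here for readability and are not part of the original. Two deliberate changes from the original are the extra GROUP BY column and the ORDER BY moved to the outer query, since ordering inside a derived table does not guarantee the order of the final result.

```sql
SELECT inc.opening_year     AS year,
       inc.number_of_films  AS opens,
       diss.number_of_films AS closures,
       inc.uk_films         AS uk_opens,
       diss.uk_films        AS uk_closures
FROM (
    -- London openings per year, with the UK total read from `opens`
    SELECT f.opening_year,
           o.number_of_films   AS uk_films,
           COUNT(f.id_film_db) AS number_of_films
    FROM film_db f
    JOIN postcodes p ON p.id_postcodes = f.opening_postcode_id
    JOIN towns t     ON t.id_towns = p.town_id AND t.town = 'London'
    JOIN opens o     ON o.opening_year = f.opening_year
    WHERE f.opening_year BETWEEN 2012 - 10 AND 2012
    GROUP BY f.opening_year, o.number_of_films
) inc
JOIN (
    -- London closures per year, with the UK total read from `closures`
    SELECT f.closing_year,
           c.number_of_films   AS uk_films,
           COUNT(f.id_film_db) AS number_of_films
    FROM film_db f
    JOIN postcodes p ON p.id_postcodes = f.postcode_id
    JOIN towns t     ON t.id_towns = p.town_id AND t.town = 'London'
    JOIN closures c  ON c.closing_year = f.closing_year
    WHERE f.closing_year BETWEEN 2012 - 10 AND 2012
    GROUP BY f.closing_year, c.number_of_films
) diss ON diss.closing_year = inc.opening_year
ORDER BY inc.opening_year DESC;
```

This removes the two full scans of film_db (the type ALL rows in the EXPLAIN output) from every run; whether the remaining London-specific part is fast enough still has to be measured.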
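"Other queries do similar things, sometimes with a UNION ALL added at the end of the query to get the record year for openings or closures. There are also queries that run against monthly and quarterly data rather than yearly data, and some that only compare historical openings/closures for a specific quarter (e.g. Q3) or month (e.g. March)."
I assume you run these selects far more often than the contents of the opens/closures tables would change, so you do not have to rebuild these tables every time you run such a query.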
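One way to rebuild the summary tables only when film_db has actually changed is sketched below. It assumes the opens and closures tables already exist with the columns produced by the CREATE TABLE ... SELECT statements above, and that the rebuild is triggered by something outside the query path, such as a scheduled job or the end of a bulk load (that trigger is an assumption, not something described in the question).

```sql
-- Rebuild the UK-wide opening counts after film_db has been updated.
TRUNCATE TABLE opens;
INSERT INTO opens (opening_year, number_of_films)
SELECT opening_year, COUNT(id_film_db)
FROM film_db
WHERE opening_year BETWEEN 2012 - 10 AND 2012
GROUP BY opening_year;

-- Rebuild the UK-wide closure counts the same way.
TRUNCATE TABLE closures;
INSERT INTO closures (closing_year, number_of_films)
SELECT closing_year, COUNT(id_film_db)
FROM film_db
WHERE closing_year BETWEEN 2012 - 10 AND 2012
GROUP BY closing_year;
```

Note that each table is briefly empty between the TRUNCATE and the INSERT, so a report that runs during the rebuild would see no UK totals. Dropping the year filter would let the same tables serve any reference year, at the cost of a few more rows.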
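"Can I improve this query to make better use of the indexes? Can I improve the indexes so that the query runs faster? Should I be using another type of database (instead of MySQL) for this kind of query, where I am mostly interested in counting entries with complex conditions and joins?"
There are of course many other possible improvements, and there should certainly be a way to get MySQL to use an index. Note that MySQL will generally not combine separate single-column indexes for a lookup like this; that is, in this case the index on opening_postcode_id and the index on opening_year are not used together. I cannot tell why neither of them is used, but I can say with confidence that these two indexes will improve this query: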
KEY `opening_year_postcode` (`opening_year`, `opening_postcode_id`)
KEY `closing_year_postcode` (`closing_year`, `postcode_id`)
See this SO answer: https://stackoverflow.com/a/6295744/176569
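The two KEY definitions above are fragments of a table definition. A minimal sketch of adding them to the existing table, and then checking whether the optimizer picks them up, might look like this. The index names are the ones suggested above; the EXPLAIN target is just the London openings part of the query, chosen here for illustration.

```sql
-- Add both composite indexes in one statement so the table is rebuilt once.
ALTER TABLE film_db
    ADD KEY opening_year_postcode (opening_year, opening_postcode_id),
    ADD KEY closing_year_postcode (closing_year, postcode_id);

-- Check which key MySQL now chooses for film_db.
EXPLAIN
SELECT film_db.opening_year, COUNT(film_db.id_film_db) AS number_of_films
FROM film_db
JOIN postcodes ON id_postcodes = opening_postcode_id
JOIN towns ON id_towns = town_id AND town = 'London'
WHERE film_db.opening_year BETWEEN 2012 - 10 AND 2012
GROUP BY film_db.opening_year;
```

On a MyISAM table of roughly 10 million rows the ALTER TABLE can take a while and blocks writes while it runs, so it is best tried on a copy first.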
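One thing I have learned over the years is that this kind of performance tuning is an incremental process. You will have to try a number of tricks and measure the gain from each, and in the end you will probably apply only one or two of them.
At this point I would not consider dropping MySQL for another database vendor. MySQL is probably not the cause of your performance problem.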