This is my SQL query; it takes about 3-4 seconds to run. I'm using Yii2.
SELECT `hotel`.* FROM `hotel`
INNER JOIN `term` ON term.hotel_ID=hotel.ID
INNER JOIN `airport_term` ON airport_term.term_ID=term.ID
INNER JOIN `airport` ON airport.ID=airport_term.airport_ID
WHERE `airport`.`name` IN ('Vienna', 'Berlin', 'Prague')
GROUP BY `hotel`.`ID`
ORDER BY `rating` DESC
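EXPLAIN output: https://pastebin.com/niEqrM5M
SHOW CREATE TABLE output: https://pastebin.com/Ws6yH3P5
Basically, what I want to accomplish is: select the hotels that have terms with the Vienna airport.
hotel: 12k records, term: 290k records, airport_term: 200k records, airport: 30 records
Is there some way to make this query faster? I already have indexes on those tables.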
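Answer 0 (score: 0)
I was able to cut the time in half by using a subquery instead of joins. The query now takes 1-2 seconds to run. Not ideal, but definitely an improvement. I still need to join hotel inside the subquery for some filtering, but even so it is faster.
I'm not an expert, but I think what helped is that instead of joining every hotel to every term, I filter the terms first and then select the matching hotels.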
SELECT `hotel`.* FROM `hotel`
INNER JOIN (
SELECT `term`.`hotel_ID` FROM `term`
INNER JOIN `airport_term` ON airport_term.term_ID=term.ID
INNER JOIN `airport` ON airport.ID=airport_term.airport_ID
WHERE `airport`.`name` IN ('Vienna', 'Berlin', 'Prague')
GROUP BY `term`.`hotel_ID`
) `subquery` ON subquery.hotel_ID=hotel.ID
ORDER BY `hotel`.`master_rating` DESC
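The same filter-first idea can also be written as a semi-join with `IN`, which drops the `GROUP BY` because each hotel is returned at most once regardless of how many terms match. This is only a sketch, assuming MySQL and the column names used in this answer (`master_rating` in particular comes from the query above, not from the question); whether it is faster than the derived-table version depends on the optimizer and the data.

```sql
-- Semi-join variant: find the hotel IDs that have at least one matching
-- term, then fetch those hotels. No GROUP BY or DISTINCT is needed.
SELECT h.*
FROM hotel AS h
WHERE h.ID IN (
    SELECT t.hotel_ID
    FROM term AS t
    INNER JOIN airport_term AS ta ON ta.term_ID = t.ID
    INNER JOIN airport AS a ON a.ID = ta.airport_ID
    WHERE a.name IN ('Vienna', 'Berlin', 'Prague')
)
ORDER BY h.master_rating DESC;
```

Running `EXPLAIN` on both versions is the simplest way to see which plan the server actually chooses.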
Answer 1 (score: 0)
Reducing the problem to its essential components...
DROP TABLE IF EXISTS hotel;
CREATE TABLE hotel
(ID SERIAL PRIMARY KEY
,rating float NOT NULL
);
-- populated with 4096 hotels
DROP TABLE IF EXISTS term;
CREATE TABLE term
(ID SERIAL PRIMARY KEY
,hotel_ID int NOT NULL
,KEY (hotel_ID)
);
-- populated with 16384 terms
DROP TABLE IF EXISTS airport;
CREATE TABLE airport
(ID SERIAL PRIMARY KEY
,name varchar(255) NOT NULL UNIQUE
);
-- populated with 50 airports
DROP TABLE IF EXISTS airport_term;
CREATE TABLE airport_term
(term_ID INT NOT NULL
,airport_ID INT NOT NULL
,PRIMARY KEY (term_ID,airport_ID)
);
-- populated with 1403 airport_term pairs
SELECT DISTINCT h.*
FROM hotel h
JOIN term t
ON t.hotel_ID = h.ID
JOIN airport_term ta
ON ta.term_ID = t.ID
JOIN airport a
ON a.ID = ta.airport_ID
WHERE a.name IN ('Vienna', 'Berlin', 'Prague')
ORDER
BY h.ID
, h.rating DESC
-- returns 72 rows in zero seconds, as follows (condensed):
+-----+------------+
| ID | rating |
+-----+------------+
| 45 | 0.0494382 |
| 57 | 0.637326 |
...
| 480 | 0.837546 |
| 481 | 0.860047 |
| 486 | 0.0134837 |
...
| 770 | 0.995263 |
| 787 | 0.590259 |
| 801 | 0.102722 |
| 808 | 0.874417 |
| 813 | 0.217236 |
...
| 885 | 0.405265 |
| 887 | 0.437901 |
| 897 | 0.720929 |
| 901 | 0.84102 |
| 903 | 0.139152 |
| 908 | 0.600746 |
| 909 | 0.502444 |
| 992 | 0.631546 |
+-----+------------+
EXPLAIN
SELECT DISTINCT h.*
FROM hotel h
JOIN term t
ON t.hotel_ID = h.ID
JOIN airport_term ta
ON ta.term_ID = t.ID
JOIN airport a
ON a.ID = ta.airport_ID
WHERE a.name IN ('Vienna', 'Berlin', 'Prague')
ORDER
BY h.ID
, h.rating DESC
+----+-------------+-------+--------+---------------------+---------+---------+--------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------+---------+---------+--------------------+------+----------------------------------------------+
| 1 | SIMPLE | ta | index | PRIMARY | PRIMARY | 8 | NULL | 1403 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | a | eq_ref | PRIMARY,ID,name | PRIMARY | 8 | test.ta.airport_ID | 1 | Using where |
| 1 | SIMPLE | t | eq_ref | PRIMARY,ID,hotel_ID | PRIMARY | 8 | test.ta.term_ID | 1 | Using where |
| 1 | SIMPLE | h | eq_ref | PRIMARY,ID | PRIMARY | 8 | test.t.hotel_ID | 1 | Using where |
+----+-------------+-------+--------+---------------------+---------+---------+--------------------+------+----------------------------------------------+
Answer 2 (score: 0)
Looking at the query and at the bottleneck I would expect the optimizer to run into, try adding the following index.
ALTER TABLE airport_term
ADD INDEX (airport_ID, term_ID)
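With your current query, the optimizer will most likely look up the airport table first, get the airport_ID, and then have to go through every record in airport_term, because it has no quick way to find the term_ID values for a given airport_ID.
By allowing a fast lookup of term_ID from airport_ID across those 200k records, this index should drastically improve that part of the query.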
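To check whether the optimizer actually picks up the new index, one option is to give it a name and re-run `EXPLAIN` on the query from the question. A minimal sketch, assuming MySQL; the index name `idx_airport_term` is hypothetical and does not appear in the original post.

```sql
-- Hypothetical index name; what matters is the column order
-- (airport_ID first, so rows can be found by airport).
ALTER TABLE airport_term
  ADD INDEX idx_airport_term (airport_ID, term_ID);

-- Confirm that the index now exists on the table.
SHOW INDEX FROM airport_term;

-- Re-run EXPLAIN on the original query and inspect the `key` column
-- of the airport_term row to see which index was chosen.
EXPLAIN
SELECT hotel.*
FROM hotel
INNER JOIN term ON term.hotel_ID = hotel.ID
INNER JOIN airport_term ON airport_term.term_ID = term.ID
INNER JOIN airport ON airport.ID = airport_term.airport_ID
WHERE airport.name IN ('Vienna', 'Berlin', 'Prague')
GROUP BY hotel.ID
ORDER BY rating DESC;
```

If the `key` column for airport_term still shows a different index, the optimizer has judged another plan cheaper for the current data.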
Answer 3 (score: -1)
I can't see the data types used in your tables, so the answer is simple:
You are only selecting data from the "hotel" table, so:
And: