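I want to search for the requested term ($q) in the content table's title and keywords, and also against the actors, which live in another table and are linked to content through a join table. I also need to fetch the view count from yet another table.
This is the query I've been working with so far. The results are good, but it is too slow: it averages 0.6 s when I run it in phpMyAdmin, and we have millions of visitors per month.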
SELECT DISTINCT SQL_CALC_FOUND_ROWS
c.*,
cv.views,
(MATCH (c.title) AGAINST ('{$q}') * 3) Relevance1,
MATCH (c.keywords) AGAINST ('{$q}') Relevance2,
(MATCH (a.`name`) AGAINST ('{$q}') * 2) Relevance3
FROM
content AS c
LEFT JOIN
content_actors AS ca ON ca.content = c.record_num
LEFT JOIN
actors AS a ON a.record_num = ca.actor
LEFT JOIN
content_views AS cv ON cv.content = c.record_num
WHERE
c.enabled = 1
GROUP BY c.title, c.length
HAVING (Relevance1 + Relevance2 + Relevance3) > 0
ORDER BY (Relevance1 + Relevance2 + Relevance3) DESC
The table schema looks like this:
content
record_num  title   keywords
1           Video1  Comedy, Action, Supercool
2           Video2  Comet
content_actors
content  actor
1        1
1        2
2        1
actors
record_num  name
1           Jennifer Lopez
2           Bruce Willis
content_views
content  views
1        160
2        312
Here are the indexes I found with SHOW INDEX FROM tablename:
Table           Column_Name  Seq_in_index  Key_name    Index_type
---------------------------------------------------------------------------
content         record_num   1             PRIMARY     BTREE
content         keywords     1             keywords    FULLTEXT
content         keywords     2             title       FULLTEXT
content         title        1             title       FULLTEXT
content         description  1             description FULLTEXT
content         keywords     1             keywords_2  FULLTEXT
content_actors  content      1             content     BTREE
content_actors  actor        2             content     BTREE
content_actors  actor        1             actor       BTREE
actors          record_num   1             PRIMARY     BTREE
actors          name         1             name        BTREE
actors          name         1             name_2      FULLTEXT
content_views   content      1             PRIMARY     BTREE
content_views   views        1             views       BTREE
Here is the EXPLAIN for the query:
ID  SELECT_TYPE  TABLE  TYPE    POSSIBLE_KEYS       KEY      ROWS   EXTRA
1   SIMPLE       c      ref     enabled_2, enabled  enabled  29210  Using where; Using temporary; Using filesort
1   SIMPLE       ca     ref     content             content  1      Using index
1   SIMPLE       a      eq_ref  PRIMARY             PRIMARY  1
1   SIMPLE       cv     eq_ref  PRIMARY             PRIMARY  1
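I'm using GROUP BY to avoid duplicate content, but the GROUP BY alone seems to double the time needed to process the query.
EDIT: After playing with the query a bit, I realized that if I remove the GROUP BY I get duplicates, and if I leave the GROUP BY in, it does not pick up the proper Relevance3 value (without the GROUP BY, one of the duplicate rows returns a value for Relevance3 while the other does not).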
Answer 0 (score: 0)
Add the MATCHes (OR'd together) to the WHERE. This will significantly reduce the number of rows that SQL_CALC_FOUND_ROWS has to process, and it removes the need for the HAVING.
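As a rough sketch of that suggestion, limited to the two columns of content (the actor match is handled separately further down): the filter moves from HAVING into WHERE, so non-matching rows are discarded before any grouping or sorting. It assumes that each MATCH column list has a FULLTEXT index with exactly those columns, and it keeps the '{$q}' placeholder and the enabled = 1 filter from the question.

```sql
-- Sketch only: content columns, no actors, no GROUP BY.
-- Assumes FULLTEXT indexes exist on (title) and on (keywords).
SELECT c.*,
       3 * MATCH(c.title)    AGAINST ('{$q}')
         + MATCH(c.keywords) AGAINST ('{$q}') AS relevance
FROM content AS c
WHERE c.enabled = 1
  AND (   MATCH(c.title)    AGAINST ('{$q}')
       OR MATCH(c.keywords) AGAINST ('{$q}') )
ORDER BY relevance DESC;
```

Running EXPLAIN on this variant and on the original shows whether the row estimate for c actually drops.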
Instead of
cv.views,
...
LEFT JOIN content_views AS cv ON cv.content = c.record_num
do
( SELECT views FROM content_views WHERE content = c.record_num ) AS views,
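Edit
The LEFT and GROUP BY are needed because actors are optional and there may be several actors per content row. Since you don't really need the actor names, you can get closer with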
WHERE ... AND EXISTS ( SELECT *
FROM content_actors AS ca
JOIN actors AS a ON ...
WHERE MATCH (a.`name`) AGAINST ('{$q}')
AND ca...
)
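But this does not let you include the actor relevance in the ORDER BY.
So you need to build a subquery with a UNION (UNION ALL in SELECT#3, so that the SUM sees both rows when a content id appears in both halves). There will be 2 SELECTs: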
SELECT#1:
SELECT c.record_num AS id,
3 * MATCH(c.title) AGAINST ('{$q}')
+ MATCH(c.keywords) AGAINST ('{$q}') AS relevance
FROM content AS c
WHERE MATCH(c.title, c.keywords) AGAINST ('{$q}')
(and have FULLTEXT(title, keywords)). This will efficiently fetch the ids of the content rows that are useful.
SELECT#2:
SELECT c.record_num AS id,
2 * MAX(MATCH(a.`name`) AGAINST ('{$q}')) AS relevance
FROM content AS c
JOIN content_actors ca ON ca.content = c.record_num
JOIN actors a ON a.record_num = ca.actor
WHERE MATCH(a.`name`) AGAINST ('{$q}')
GROUP BY c.record_num
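Be sure to have content_actors: INDEX(actor) and content: INDEX(record_num). This SELECT will efficiently start with actors, then work back to content. Note that it behaves differently from your code when two actors MATCH; hopefully my MAX is a better solution.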
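A minimal sketch of how those index requirements could be checked, and added only where missing. The index names ft_title_keywords and idx_actor are hypothetical; the SHOW INDEX output in the question suggests that some of these indexes may already exist under other names, in which case the ALTER statements are unnecessary.

```sql
-- Check what is already there before adding anything.
SHOW INDEX FROM content WHERE Index_type = 'FULLTEXT';
SHOW INDEX FROM content_actors;

-- Only if missing (index names are hypothetical):
ALTER TABLE content        ADD FULLTEXT INDEX ft_title_keywords (title, keywords);
ALTER TABLE content_actors ADD INDEX idx_actor (actor);
```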
Now, let's put things together...
SELECT#3:
SELECT id, SUM(relevance) AS relevance
FROM (
( ... select #1 ... )
UNION ALL
( ... select #2 ... )
) AS x -- MySQL requires an alias on a derived table
GROUP BY id
But that is not quite all...
SELECT#4:
SELECT c.*,
( ... views ... ) AS views
FROM ( ... select #3 ... ) AS u
JOIN content c ON c.record_num = u.id
ORDER BY u.relevance DESC
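I recommend running each of these steps by hand to verify them, gradually putting the pieces together. Yes, it is complex, but it should be very fast.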
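For reference, here is one way the four pieces might look once assembled. This is a sketch, not a tested query. It assumes the key column is record_num as in the question's schema, that the actor name is in actors.name, and that a FULLTEXT index exists for every MATCH column list used: (title), (keywords), (title, keywords) and (name). The enabled = 1 filter and the '{$q}' placeholder are carried over from the question, and the aliases c1, c2, x and u are arbitrary.

```sql
SELECT c.*,
       ( SELECT views FROM content_views
         WHERE content = c.record_num ) AS views,
       u.relevance
FROM (
    -- select #3: add up the title/keyword score and the actor score per id
    SELECT id, SUM(relevance) AS relevance
    FROM (
        -- select #1: title and keyword matches
        ( SELECT c1.record_num AS id,
                 3 * MATCH(c1.title)    AGAINST ('{$q}')
                   + MATCH(c1.keywords) AGAINST ('{$q}') AS relevance
          FROM content AS c1
          WHERE MATCH(c1.title, c1.keywords) AGAINST ('{$q}') )
        UNION ALL
        -- select #2: best-matching actor per content row
        ( SELECT c2.record_num AS id,
                 2 * MAX(MATCH(a.`name`) AGAINST ('{$q}')) AS relevance
          FROM content AS c2
          JOIN content_actors AS ca ON ca.content = c2.record_num
          JOIN actors AS a ON a.record_num = ca.actor
          WHERE MATCH(a.`name`) AGAINST ('{$q}')
          GROUP BY c2.record_num )
    ) AS x
    GROUP BY id
) AS u
JOIN content AS c ON c.record_num = u.id
WHERE c.enabled = 1
ORDER BY u.relevance DESC;
```

Because each content id appears at most once in u, the outer query needs neither DISTINCT nor GROUP BY, which is where the original query spent much of its time.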