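I have 2 tables: one stores items and the other stores likes.
The table that stores likes is called video_liked and has 2 columns, video_id and user_id, with 2 indexes: video_id-user_id (UNIQUE) and user_id-video_id (PRIMARY).
The other table is called video and has an auto-increment id column as its primary key.
I'm trying to get a list of the items liked by users who also liked the item the viewer is currently watching, ordered by the number of people who liked them, with a minimum of 2 likes.
The query I'm using is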
SELECT vid . * , count( video_liked1.user_id ) AS PersonCount
FROM video AS vid, video_liked, video_liked AS video_liked1
WHERE video_liked.user_id = video_liked1.user_id
AND video_liked.video_id <> video_liked1.video_id
AND video_liked1.video_id = 'ITEM_ID'
AND vid.id = video_liked.video_id
GROUP BY video_liked.video_id
HAVING count( video_liked1.user_id ) >2
ORDER BY PersonCount DESC
LIMIT 12
The query is slow when there are a lot of likes, so I cut it down to its most basic structure
SELECT vid. *
FROM video AS vid, video_liked, video_liked AS video_liked1
WHERE video_liked.user_id = video_liked1.user_id
AND video_liked.video_id <> video_liked1.video_id
AND video_liked1.video_id = 'ITEM_ID'
AND vid.id = video_liked.video_id
GROUP BY video_liked.video_id
LIMIT 12
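It is slightly faster, but it still takes 0.05 seconds to execute against a likes table with 28k rows.
The EXPLAIN output is too wide to read without line wrapping, so here is a link to it on pastebin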
http://pastebin.com/raw.php?i=6edwdniQ
My table definitions are on pastebin as well
http://pastebin.com/raw.php?i=jwK1QucA
Edit:
Changed the query as suggested
SELECT vid . *, count( v1.user_id ) AS PersonCount
FROM video AS vid
JOIN video_liked AS v1 ON vid.id = v1.video_id
JOIN video_liked AS v2 ON v2.video_id = 'ITEM_ID'
AND v1.user_id = v2.user_id
AND v1.video_id <> v2.video_id
GROUP BY v1.video_id
ORDER BY PersonCount DESC
LIMIT 12
The culprit for the slowness seems to be the GROUP BY, which creates a temporary table.
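Answer 0: (score: 1)
Remove the CROSS JOINs (the comma-separated tables in the FROM clause) from your query and use explicit JOINs instead. Those bloat your dataset.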
SELECT vid.*
FROM video AS vid
JOIN video_liked AS v1 ON vid.id = v1.video_id
JOIN video_liked AS v2 ON v2.video_id = 'ITEM_ID' AND v1.user_id = v2.user_id AND v1.video_id <> v2.video_id
GROUP BY v1.video_id
LIMIT 12
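If the GROUP BY is still the slow part, one thing that might be worth trying is to do the grouping on video_liked alone and join to video only for the final 12 rows. This is a rough sketch, not a tested fix. It assumes MySQL, that every video_id in video_liked exists in video, and that "minimum of 2 likes" means a count of at least 2 (the query in the question uses > 2, so adjust the HAVING to taste). 'ITEM_ID' is the same placeholder as in the question, and the alias liked is made up for the example.

```sql
-- Sketch only: aggregate the narrow likes table first, then fetch video rows.
SELECT vid.*, liked.PersonCount
FROM (
    SELECT v1.video_id,
           COUNT(*) AS PersonCount      -- user_id is part of the primary key, so never NULL
    FROM video_liked AS v2
    JOIN video_liked AS v1
      ON v1.user_id = v2.user_id        -- same user liked both videos
     AND v1.video_id <> v2.video_id     -- skip the video being watched
    WHERE v2.video_id = 'ITEM_ID'
    GROUP BY v1.video_id
    HAVING COUNT(*) >= 2                -- assumption: "minimum of 2 likes"
    ORDER BY PersonCount DESC
    LIMIT 12
) AS liked
JOIN video AS vid ON vid.id = liked.video_id
ORDER BY liked.PersonCount DESC;
```

The grouping may still need a temporary table, but it would only hold video_id and a count rather than full video rows. Run it through EXPLAIN to see whether it actually helps on your data.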
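Answer 1: (score: 1)
Besides removing the cross joins, I would also explicitly list the columns you need in the SELECT clause, even if you need all of them.
Which platform is this DB running on? What other indexes do you have on the video table?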
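As an illustration of what I mean, here is the edited query from the question with an explicit column list. Only id is known from the question; title is a hypothetical column name, so substitute whatever your video table really contains. Grouping by vid.id (the primary key) instead of v1.video_id is my own choice, so that the other video columns are determined by the grouped key.

```sql
-- Sketch only: same joins as the edited query, explicit columns instead of vid.*
SELECT vid.id,
       vid.title,                       -- hypothetical column, replace with real ones
       COUNT(v1.user_id) AS PersonCount
FROM video AS vid
JOIN video_liked AS v1 ON vid.id = v1.video_id
JOIN video_liked AS v2 ON v2.video_id = 'ITEM_ID'
                      AND v1.user_id = v2.user_id
                      AND v1.video_id <> v2.video_id
GROUP BY vid.id
ORDER BY PersonCount DESC
LIMIT 12;
```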