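I have a blog where I track who views each post, and when, by inserting a row into a table (views) with the visitor's IP, the post ID (I have a primary key on these two fields) and a timestamp.
This table is used to display the top 5 posts for each of my categories (there are 4 of them) over the last day, week, month and year, and for all time. So 20 queries are executed in total, each taking 0.2 to 0.7 seconds... My page takes more than 7 seconds to load, which is bad.
Here is some useful information about my database structure: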
+---------------------+   +----------------------+
| posts (82 rows)     |   | views (50k rows)     |
+=====================+   +======================+
| id (primary)        |   | ip (primary)         |
+---------------------+   +----------------------+
| type                |   | article_id (primary) |
+---------------------+   +----------------------+
| thumbnail           |   | date (index)         |
+---------------------+   +----------------------+
| title (index)       |
+---------------------+
| url                 |
+---------------------+
| description (index) |
+---------------------+
| content             |
+---------------------+
| date                |
+---------------------+
| lastmod             |
+---------------------+
| sources             |
+---------------------+
| tags                |
+---------------------+
| published           |
+---------------------+
| ...                 |
+---------------------+
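The "..." row stands for the other fields that hold the English version of my posts (url_en, title_en, description_en, tags_en and content_en).
Here is one of my big queries (they are all basically the same):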
SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p
LEFT JOIN (SELECT COUNT(*) AS cnt, article_id -- 0.34s
           FROM views
           WHERE article_id IN (SELECT id
                                FROM posts
                                WHERE id <> 12 AND type = 'Tutoriel')
             AND date BETWEEN '2013-01-01' AND NOW() -- the date is normally a variable; a fixed value is used here for testing
           GROUP BY article_id
           ORDER BY cnt DESC
           LIMIT 5) AS tmp
       ON p.id = tmp.article_id
WHERE p.id IN (SELECT article_id
               FROM (SELECT COUNT(*) AS cnt, article_id -- 0.34s
                     FROM views
                     WHERE article_id IN (SELECT id
                                          FROM posts
                                          WHERE id <> 12 AND type = 'Tutoriel')
                       AND date BETWEEN '2013-01-01' AND NOW()
                     GROUP BY article_id
                     ORDER BY cnt DESC
                     LIMIT 5) AS tmp2)
ORDER BY views DESC
I found that the BETWEEN clause takes most of the time, because the same query for the all-time statistics across all posts (so depending on neither the category nor the date) takes only 0.33 seconds to execute.
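I have looked at this query in every way I could think of and cannot find a simpler, better optimized way to write it... but I feel there must be one. Maybe I am just missing something obvious.
One thing that bothers me is the repeated subquery. I have not found any other way to get both my post data and the associated view counts.
One thing I was considering is making an AJAX request for each period when the user clicks that period's tab (it is a tabbed view). However, that does not really solve the problem; it is just a dirty workaround.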
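I could also partition the posts table so that one table holds a few of the fields (title, description, ...) and another holds the remaining fields. If I'm not mistaken, that could speed things up a bit.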
Can anyone give me some advice? By the way, thanks for bearing with me this far :)
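Answer 0 (score: 1)
Not sure it will help, but if BETWEEN takes a lot of time, maybe turn it into a different condition? Change
date BETWEEN '2013-01-01' AND NOW()
to
date >= '2013-01-01'
That way it does not have to compare against two dates. Assuming no view is dated in the future, every matching row falls between 2013-01-01 and now anyway (BETWEEN is inclusive, hence >= rather than >).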
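To make the equivalence concrete, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in for MySQL. The three sample rows and the fixed "now" timestamp are made up for illustration, and the sketch only checks which rows each condition returns; it says nothing about speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE views (ip TEXT, article_id INTEGER, date TEXT, "
    "PRIMARY KEY (ip, article_id))"
)
# Hypothetical sample rows: one before, one exactly on, one after the start date.
rows = [
    ("10.0.0.1", 1, "2012-12-31 23:59:59"),
    ("10.0.0.2", 1, "2013-01-01 00:00:00"),
    ("10.0.0.3", 2, "2013-06-15 12:00:00"),
]
conn.executemany("INSERT INTO views VALUES (?, ?, ?)", rows)

start = "2013-01-01 00:00:00"
now = "2014-01-01 00:00:00"  # fixed stand-in for NOW(), later than every row

between = conn.execute(
    "SELECT ip FROM views WHERE date BETWEEN ? AND ? ORDER BY ip", (start, now)
).fetchall()
at_or_after = conn.execute(
    "SELECT ip FROM views WHERE date >= ? ORDER BY ip", (start,)
).fetchall()
strictly_after = conn.execute(
    "SELECT ip FROM views WHERE date > ? ORDER BY ip", (start,)
).fetchall()

print(between == at_or_after)     # same rows when nothing is dated after "now"
print(between == strictly_after)  # differs: the row exactly at the start is dropped
```

With no rows dated after "now", the one-sided >= condition selects the same rows as BETWEEN, while a strict > silently loses the row stamped exactly at the start of the range.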
Answer 1 (score: 1)
Older versions of MySQL are notably poor at optimizing IN with subqueries. Try using a JOIN instead:
SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p
LEFT JOIN (SELECT COUNT(*) AS cnt, article_id -- 0.34s
           FROM views
           WHERE article_id IN (SELECT id
                                FROM posts
                                WHERE id <> 12 AND type = 'Tutoriel')
             AND date BETWEEN '2013-01-01' AND NOW() -- the date is normally a variable; a fixed value is used here for testing
           GROUP BY article_id
           ORDER BY cnt DESC
           LIMIT 5) AS tmp
       ON p.id = tmp.article_id
JOIN (SELECT COUNT(*) AS cnt, v.article_id -- 0.34s
      FROM views v
      JOIN (SELECT id
            FROM posts p
            WHERE p.id <> 12 AND p.type = 'Tutoriel') p
        ON v.article_id = p.id
      WHERE v.date BETWEEN '2013-01-01' AND NOW()
      GROUP BY v.article_id
      ORDER BY cnt DESC
      LIMIT 5) a
  ON p.id = a.article_id
ORDER BY views DESC
Edit:
If I understand the query correctly, you can simply change the LEFT OUTER JOIN to a JOIN and drop the WHERE clause entirely:
SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p
JOIN (SELECT COUNT(*) AS cnt, article_id -- 0.34s
      FROM views
      WHERE article_id IN (SELECT id
                           FROM posts
                           WHERE id <> 12 AND type = 'Tutoriel')
        AND date BETWEEN '2013-01-01' AND NOW() -- the date is normally a variable; a fixed value is used here for testing
      GROUP BY article_id
      ORDER BY cnt DESC
      LIMIT 5) tmp
  ON p.id = tmp.article_id;
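As a quick sanity check of that simplification, the sketch below compares the two shapes on toy data. It uses Python's sqlite3 module rather than MySQL, a top 2 instead of a top 5, and invented rows where post i has exactly i views, so all of those details are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, type TEXT, title TEXT, url TEXT);
    CREATE TABLE views (ip TEXT, article_id INTEGER, date TEXT,
                        PRIMARY KEY (ip, article_id));
""")
# Toy data: post i gets i views, so the counts are distinct and the top 2 is unambiguous.
for i in range(1, 5):
    conn.execute("INSERT INTO posts VALUES (?, 'Tutoriel', ?, ?)",
                 (i, f"post {i}", f"/p{i}"))
    for n in range(i):
        conn.execute("INSERT INTO views VALUES (?, ?, '2013-06-01')",
                     (f"10.0.0.{n}", i))

# The "top N by view count" derived table shared by both query shapes.
TOP = """SELECT COUNT(*) AS cnt, article_id
         FROM views
         WHERE date >= '2013-01-01'
         GROUP BY article_id
         ORDER BY cnt DESC
         LIMIT 2"""

# Shape from the question: LEFT JOIN plus a WHERE ... IN that repeats the subquery.
left_join_filtered = conn.execute(f"""
    SELECT p.id, tmp.cnt AS views
    FROM posts AS p
    LEFT JOIN ({TOP}) AS tmp ON p.id = tmp.article_id
    WHERE p.id IN (SELECT article_id FROM ({TOP}) AS tmp2)
    ORDER BY views DESC""").fetchall()

# Simplified shape: an inner JOIN already discards posts outside the top N.
inner_join = conn.execute(f"""
    SELECT p.id, tmp.cnt AS views
    FROM posts AS p
    JOIN ({TOP}) AS tmp ON p.id = tmp.article_id
    ORDER BY views DESC""").fetchall()

print(left_join_filtered == inner_join)
```

The inner JOIN keeps only posts that have a matching row in the derived table, which is exactly what the repeated IN filter was doing, so the aggregate only needs to be computed once.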
Then change the IN in the subquery to a join:
SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p
JOIN (SELECT COUNT(*) AS cnt, v.article_id -- 0.34s
      FROM views v
      JOIN (SELECT DISTINCT p.id -- DISTINCT may not be necessary
            FROM posts p
            WHERE p.id <> 12 AND p.type = 'Tutoriel') p
        ON v.article_id = p.id
      WHERE v.date BETWEEN '2013-01-01' AND NOW() -- the date is normally a variable; a fixed value is used here for testing
      GROUP BY v.article_id
      ORDER BY cnt DESC
      LIMIT 5) tmp
  ON p.id = tmp.article_id;
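The IN-to-JOIN rewrite of the inner aggregate can be checked the same way. This is again only a sketch under assumptions: sqlite3 stands in for MySQL, the rows are invented (post i has i views, post 5 belongs to a made-up 'News' category), the excluded id is 4 rather than 12, and the limit is 2. It confirms the two forms return the same rows, not that one is faster; SQLite's planner is not MySQL's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE views (ip TEXT, article_id INTEGER, date TEXT,
                        PRIMARY KEY (ip, article_id));
""")
# Toy data: posts 1-4 are 'Tutoriel', post 5 is 'News'; post i gets i views.
for i in range(1, 6):
    kind = "News" if i == 5 else "Tutoriel"
    conn.execute("INSERT INTO posts VALUES (?, ?)", (i, kind))
    for n in range(i):
        conn.execute("INSERT INTO views VALUES (?, ?, '2013-06-01')",
                     (f"10.0.0.{n}", i))

# Original form: filter the views with an IN subquery on posts.
with_in = conn.execute("""
    SELECT COUNT(*) AS cnt, article_id
    FROM views
    WHERE article_id IN (SELECT id FROM posts
                         WHERE id <> 4 AND type = 'Tutoriel')
      AND date >= '2013-01-01'
    GROUP BY article_id
    ORDER BY cnt DESC
    LIMIT 2""").fetchall()

# Rewritten form: join the views to the filtered posts instead.
with_join = conn.execute("""
    SELECT COUNT(*) AS cnt, v.article_id
    FROM views AS v
    JOIN (SELECT id FROM posts
          WHERE id <> 4 AND type = 'Tutoriel') AS p
      ON v.article_id = p.id
    WHERE v.date >= '2013-01-01'
    GROUP BY v.article_id
    ORDER BY cnt DESC
    LIMIT 2""").fetchall()

print(with_in == with_join)
```

Because posts.id is the primary key, each view row matches at most one row of the filtered posts, so the join cannot inflate the counts; that is also why the DISTINCT in the rewritten query is likely unnecessary.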