Lots of queries on one page take too long to execute... can't seem to optimize them

Time: 2013-07-27 15:22:37

Tags: mysql sql performance optimization subquery

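I have a blog that tracks who views posts and when, by inserting an entry into a table (views) holding the visitor's IP, the post ID (I have a primary key on these two fields) and a timestamp.

This table is used to display the top 5 posts for each of my categories (there are 4 of them), for the last day/week/month/year and for all time. So in total 20 queries are executed, each taking 0.2 to 0.7 seconds... my page takes over 7 seconds to load, which is terrible.

Here is some useful information about my database structure: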

+---------------------+        +----------------------+
|   posts (82 rows)   |        |   views (50k rows)   |
+=====================+        +======================+
|    id (primary)     |        |     ip (primary)     |
+---------------------+        +----------------------+
|        type         |        | article_id (primary) |
+---------------------+        +----------------------+
|     thumbnail       |        |     date (index)     |
+---------------------+        +----------------------+
|    title (index)    |       
+---------------------+
|         url         |
+---------------------+
| description (index) |
+---------------------+
|       content       | 
+---------------------+
|        date         |
+---------------------+
|       lastmod       |
+---------------------+
|       sources       |
+---------------------+
|        tags         |
+---------------------+
|      published      |
+---------------------+
|         ...         |
+---------------------+

The ... stands for the other fields, which hold the English version of my posts (url_en, title_en, description_en, tags_en and content_en).
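For reference, the views table from the diagram could be declared roughly as below. This is only a sketch: the column types, the idx_date index name, the sample IP address and the use of INSERT IGNORE to skip repeat views from the same IP are all assumptions, since only the column names and keys are given above.

```sql
-- Sketch of the views table; column types are assumed, not taken from the question.
CREATE TABLE views (
    ip         VARCHAR(45) NOT NULL,   -- visitor IP (assumed textual storage)
    article_id INT         NOT NULL,   -- refers to posts.id
    date       DATETIME    NOT NULL,   -- time of the view
    PRIMARY KEY (ip, article_id),      -- composite primary key, as in the diagram
    KEY idx_date (date)                -- the index on date (name is hypothetical)
);

-- Recording a view. With the composite primary key, a second view of the same
-- post from the same IP would be a duplicate; INSERT IGNORE (an assumption
-- about how duplicates are handled) silently skips it.
INSERT IGNORE INTO views (ip, article_id, date)
VALUES ('203.0.113.7', 12, NOW());
```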

Here is one of my big queries (they are all basically the same):

SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p
LEFT JOIN (SELECT COUNT(*) AS cnt, article_id -- 0.34s
           FROM views
           WHERE article_id IN (SELECT id
                                FROM posts
                                WHERE id <> 12 AND type = 'Tutoriel') AND
                 date BETWEEN 01-01-2013 AND NOW() -- the 01-01-2013 is normally a variable but for testing purposes I've replaced it with a fixed date here
           GROUP BY article_id
           ORDER BY cnt DESC LIMIT 5) AS tmp
       ON p.id = tmp.article_id
WHERE p.id IN (SELECT article_id
               FROM (SELECT COUNT(*) AS cnt, article_id -- 0.34s
                     FROM views
                     WHERE article_id IN (SELECT id
                                          FROM posts
                                          WHERE id <> 12 AND type = 'Tutoriel') AND
                           date BETWEEN 01-01-2013 AND NOW()
                     GROUP BY article_id
                     ORDER BY cnt DESC LIMIT 5) AS tmp2
              )
ORDER BY views DESC

I've found that the BETWEEN clause takes up most of the time, because the same query for the all-time statistics of all posts (so not dependent on the category or on the date) takes only 0.33 seconds to execute.

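I've looked at this query in every way I can think of and can't find a simpler, better optimized way to write it... yet I feel there must be one. Maybe I'm just missing something obvious.

One thing that bothers me is my repeated subquery. I haven't found any other way to get my post data together with the associated number of views.

What I'm considering is executing an AJAX request for each period when the user clicks on that period's tab (it's a tabbed view). However, that doesn't really solve the problem; it's just a dirty workaround.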

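I could possibly partition my posts table in one of the following ways:

  • one table for the French version and another for the English version
  • one table for the commonly used fields (such as title and description) and another for the remaining fields
  • a combination of the above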

If I'm not mistaken, that could speed things up a bit.

Can anyone give me some advice? By the way, thanks for bearing with me this far :)

2 answers:

Answer 0: (score: 1)

Not sure it will help, but if the BETWEEN takes a lot of time, maybe turn it into a different condition? Change

date BETWEEN 01-01-2013 AND NOW()

date > 01-01-2013

That way it doesn't have to compare against two dates; anything after 01-01-2013 will always fall between 01-01-2013 and now.
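A minimal sketch of the inner counting query with the one-sided condition follows. BETWEEN includes both endpoints, so >= rather than > keeps the same lower bound. Dropping the upper bound is only equivalent if no row in views has a date in the future, which is assumed here. The quoted '2013-01-01' literal is also an assumption about the intended start date: written unquoted, 01-01-2013 is evaluated by MySQL as integer arithmetic rather than as a date.

```sql
-- Top 5 most viewed posts since the start date, with no upper bound.
-- Assumes views.date never holds a future timestamp.
SELECT COUNT(*) AS cnt, article_id
FROM views
WHERE date >= '2013-01-01'   -- inclusive lower bound, same as BETWEEN's
GROUP BY article_id
ORDER BY cnt DESC
LIMIT 5;
```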

Answer 1: (score: 1)

Older versions of MySQL are particularly bad at optimizing in with a subquery. Try using a join instead:

SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p 
LEFT JOIN (SELECT COUNT(*) AS cnt, article_id -- 0.34s
           FROM views
           WHERE article_id IN (SELECT id
                                FROM posts
                                WHERE id <> 12 AND type = 'Tutoriel') AND 
                 date BETWEEN 01-01-2013 AND NOW() -- the 01-01-2013 is normally a variable but for testing purposes I've replaced it with a fixed date here
           GROUP BY article_id
           ORDER BY cnt DESC LIMIT 5) AS tmp 
       ON p.id = tmp.article_id join
          (SELECT COUNT(*) AS cnt, article_id -- 0.34s
           FROM views v join
                (SELECT id
                 FROM posts p
                 WHERE p.id <> 12 AND p.type = 'Tutoriel'
                ) p
                on v.article_id = p.id
            WHERE v.date BETWEEN 01-01-2013 AND NOW()
            GROUP BY v.article_id
            ORDER BY cnt DESC
            LIMIT 5
           ) a
       on p.id = a.article_id
ORDER BY views DESC

EDIT:

If I understand the query correctly, you can just change the left outer join to a join and drop the where clause entirely:

SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p JOIN
     (SELECT COUNT(*) AS cnt, article_id -- 0.34s
      FROM views
      WHERE article_id IN (SELECT id
                           FROM posts
                           WHERE id <> 12 AND type = 'Tutoriel') AND 
            date BETWEEN 01-01-2013 AND NOW() -- the 01-01-2013 is normally a variable but for testing purposes I've replaced it with a fixed date here
     GROUP BY article_id
     ORDER BY cnt DESC
     LIMIT 5
    ) tmp 
    ON p.id = tmp.article_id;

Then change the in inside the subquery to a join:

SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p JOIN
     (SELECT COUNT(*) AS cnt, article_id -- 0.34s
      FROM views v join
           (SELECT distinct p.id  -- distinct may not be necessary
            FROM posts p
            WHERE p.id <> 12 AND p.type = 'Tutoriel'
           ) p
           on v.article_id = p.id
      WHERE date BETWEEN 01-01-2013 AND NOW() -- the 01-01-2013 is normally a variable but for testing purposes I've replaced it with a fixed date here
     GROUP BY article_id
     ORDER BY cnt DESC
     LIMIT 5
    ) tmp 
    ON p.id = tmp.article_id;
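One way to check whether this rewrite helps on a given MySQL version is to run it under EXPLAIN and compare the plan with the one for the original query. The sketch below restates the last query with every column qualified. The alias t and the quoted '2013-01-01' literal are assumptions (unquoted, 01-01-2013 is evaluated as arithmetic rather than as a date), and the outer ORDER BY is carried over from the question's query, since the join alone does not guarantee row order.

```sql
-- Show the execution plan for the join-based version of the query.
EXPLAIN
SELECT p.title, p.id, p.url, tmp.cnt AS views
FROM posts AS p
JOIN (SELECT v.article_id, COUNT(*) AS cnt
      FROM views AS v
      JOIN posts AS t                       -- t: hypothetical alias for the category filter
        ON v.article_id = t.id
      WHERE t.id <> 12
        AND t.type = 'Tutoriel'
        AND v.date BETWEEN '2013-01-01' AND NOW()
      GROUP BY v.article_id
      ORDER BY cnt DESC
      LIMIT 5
     ) AS tmp
  ON p.id = tmp.article_id
ORDER BY views DESC;
```

Running the same EXPLAIN on the original query should show whether the IN subqueries are still being executed once per row.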