GROUP BY and ORDER BY on JOINed tables - complex and slow

Time: 2013-11-25 11:41:47

Tags: php mysql sql

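So the story goes like this... I have Users, and they have Children. Every day I want to use a CRON JOB to send coupons to users who have children born within a given range of birth dates. I want to know which user will get the coupon, and for which child. I also want to send only one coupon per child, and the child must be the user's youngest.

I have the following tables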

Children
+--------------------------------------+
- Primary Key: childrenID (int)
- Index: userID (int)
- Index: childBirthDate (date)
+--------------------------------------+
- childrenID - userID - childBirthDate -
- 1          - 1      - 21/01/2000     -
- 2          - 1      - 01/11/2013     -
- 3          - 1      - 25/10/2013     -
- 4          - 2      - 01/11/2013     -
- 5          - 3      - 01/11/2013     -
+--------------------------------------+

Users
+------------------------+
- Primary Key: userID (int)
- Index: categoryGroup (varchar)
+------------------------+
- userID - categoryGroup -
- 1      - 'Group1'      -
- 2      - 'Group1'      -
- 3      - 'Group2'      -
- 4      - 'Group2'      -
+------------------------+

CuponRequests
+------------------------+
- Primary Key: ID (int)
- Index: userID (int)
- Index: cuponID (int)
+-----------------------+
- ID - cuponID - userID -
- 1  - 1       - 1      -
- 2  - 2       - 1      -
- 3  - 1       - 2      -
+-----------------------+

These are basically the three main tables, with their relevant columns. I have the following SQL query to fetch the result I need.

SELECT users.userID,
       users.categoryGroup,
       children.childBirthDate,
       children.childrenID
FROM users,
  (SELECT *
   FROM
     (SELECT children.childrenID,
             children.childBirthDate,
             users.userID AS child_uid
      FROM children,
           users
      WHERE children.userID = users.userID
      ORDER BY children.childBirthDate DESC)t1
   GROUP BY child_uid)children
WHERE (children.childBirthDate <= DATE_SUB(CURDATE(), INTERVAL 5 MONTH))
  AND (children.childBirthDate > DATE_SUB(CURDATE() , INTERVAL 6 MONTH))
  AND (children.child_uid = users.userID)
  AND ('Group1, Group2' LIKE CONCAT('%', users.categoryGroup, '%'))
  AND NOT EXISTS
    (SELECT userID,
            cuponID
     FROM cuponRequests
     WHERE userID = users.userID
       AND cuponID = 1)
  AND userID = 1
ORDER BY children.childBirthDate DESC

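In this query I tried to work on only one user and one coupon, but its natural behavior is to run over all users.

The "cuponID" and the intervals come from the PHP side of the script: I iterate over the "cupons" table (not shown here) and execute this query for each "cupon" row.

The problem is that this query takes about 1.5 seconds to execute (O.O). Besides running in the CRON JOB environment, this script also runs right after a user registers on the site. I have 96 coupons, so this slows registration down by about a minute (which is a lot).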


I think this query

SELECT *
FROM
  (SELECT children.childrenID,
          children.childBirthDate,
          users.userID AS child_uid
   FROM children,
        users
   WHERE children.userID = users.userID
   ORDER BY children.childBirthDate DESC)t1
GROUP BY child_uid

is what slows things down. I tried doing a JOIN in the outer SELECT instead of selecting from a subquery, like this:

FROM users LEFT JOIN children ON children.userID = users.userID

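But then I lose the "ORDER BY childBirthDate DESC" that picks the user's youngest child, and I lose the "GROUP BY child_uid" that returns only one of his children.

Any ideas on how to make this faster while still working?

P.S. Sorry for my poor English.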


Edit:

This is the EXPLAIN output for the query:

+----+--------------------+---------------+-------+----------------+---------+---------+------------------------------+-------+-----------------------------------------------------+
| id |    select_type     |     table     | type  | possible_keys  |   key   | key_len |             ref              | rows  |                        Extra                        |
+----+--------------------+---------------+-------+----------------+---------+---------+------------------------------+-------+-----------------------------------------------------+
|  1 | PRIMARY            | NULL          | NULL  | NULL           | NULL    | NULL    | NULL                         | NULL  | Impossible WHERE noticed after reading const tables |
|  4 | DEPENDENT SUBQUERY | cuponRequests | ref   | userID,cuponID | userID  | 5       | const                        | 1     | Using where                                         |
|  2 | DERIVED            | <derived3>    | ALL   | NULL           | NULL    | NULL    | NULL                         | 73526 | Using temporary; Using filesort                     |
|  3 | DERIVED            | users         | index | PRIMARY        | PRIMARY | 4       | NULL                         | 69271 | Using index; Using temporary; Using filesort        |
|  3 | DERIVED            | children      | ref   | userID         | userID  | 4       | users.userID                 | 1     |                                                     |
+----+--------------------+---------------+-------+----------------+---------+---------+------------------------------+-------+-----------------------------------------------------+

1 answer:

Answer 0: (score: 1)

This query should be faster. I have moved the birth date conditions into the inner query.

SELECT *
FROM
  (SELECT children.childrenID,
          children.childBirthDate,
          users.userID AS child_uid
   FROM children,
        users
   WHERE children.userID = users.userID
   AND children.childBirthDate <= DATE_SUB(CURDATE(), INTERVAL 5 MONTH)
   AND children.childBirthDate > DATE_SUB(CURDATE() , INTERVAL 6 MONTH)
   ORDER BY children.childBirthDate DESC)t1
GROUP BY child_uid

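Edit

This is the full query in the fastest form I could write. I have removed % from the LIKE, changed the subquery to a join, and removed the *. The birth date conditions have been moved as well. There may still be mistakes, though.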

SELECT users.userID,
       users.categoryGroup,
       children.childBirthDate,
       children.childrenID
FROM
  -- MAX(childBirthDate) is the latest birth date, i.e. the youngest child in the window
  (SELECT MAX(childBirthDate) AS childBirthDate, userID
   FROM children
   WHERE childBirthDate <= DATE_SUB(CURDATE(), INTERVAL 5 MONTH)
     AND childBirthDate > DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
   GROUP BY userID) AS ch1
  INNER JOIN users ON users.userID = ch1.userID
  INNER JOIN children ON users.userID = children.userID
                     AND ch1.childBirthDate = children.childBirthDate
  -- anti-join: keep only users that have no request for this coupon
  LEFT JOIN cuponRequests ON cuponRequests.userID = users.userID
                         AND cuponRequests.cuponID = 1
WHERE ('Group1' LIKE users.categoryGroup OR 'Group2' LIKE users.categoryGroup)
  AND cuponRequests.ID IS NULL
  AND users.userID = 1
ORDER BY children.childBirthDate DESC
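
To sanity-check this join-based shape against the sample rows from the question, here is a minimal sketch in Python using the standard-library sqlite3 module instead of MySQL (an assumption made only so the example is self-contained). It uses MAX(childBirthDate) to select the youngest child in the window, SQLite's date(:today, '-5 months') in place of DATE_SUB(CURDATE(), INTERVAL 5 MONTH), and IN (...) for the category test. The helper names make_db and eligible, the two composite indexes, and the fixed "today" of 2014-04-20 are hypothetical choices for illustration, not part of the original schema.

```python
import sqlite3

# Schema mirrors the question's tables; the composite indexes are an assumption.
SCHEMA = """
CREATE TABLE users (userID INTEGER PRIMARY KEY, categoryGroup TEXT);
CREATE TABLE children (childrenID INTEGER PRIMARY KEY, userID INTEGER,
                       childBirthDate TEXT);  -- ISO dates sort correctly as text
CREATE TABLE cuponRequests (ID INTEGER PRIMARY KEY, cuponID INTEGER, userID INTEGER);
CREATE INDEX idx_children_user_birth ON children (userID, childBirthDate);
CREATE INDEX idx_requests_user_cupon ON cuponRequests (userID, cuponID);
"""

QUERY = """
SELECT u.userID, u.categoryGroup, c.childBirthDate, c.childrenID
FROM (SELECT userID, MAX(childBirthDate) AS childBirthDate
      FROM children
      WHERE childBirthDate <= date(:today, '-5 months')
        AND childBirthDate >  date(:today, '-6 months')
      GROUP BY userID) AS youngest
JOIN users u ON u.userID = youngest.userID
JOIN children c ON c.userID = youngest.userID
               AND c.childBirthDate = youngest.childBirthDate
LEFT JOIN cuponRequests r ON r.userID = u.userID AND r.cuponID = :cupon_id
WHERE r.ID IS NULL                      -- anti-join: no request for this coupon yet
  AND u.categoryGroup IN ('Group1', 'Group2')
ORDER BY c.childBirthDate DESC, u.userID
"""


def make_db():
    """Build an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "Group1"), (2, "Group1"), (3, "Group2"), (4, "Group2")])
    conn.executemany("INSERT INTO children VALUES (?, ?, ?)",
                     [(1, 1, "2000-01-21"), (2, 1, "2013-11-01"), (3, 1, "2013-10-25"),
                      (4, 2, "2013-11-01"), (5, 3, "2013-11-01")])
    conn.executemany("INSERT INTO cuponRequests (cuponID, userID) VALUES (?, ?)",
                     [(1, 1), (2, 1), (1, 2)])
    return conn


def eligible(conn, cupon_id, today):
    """Return (userID, categoryGroup, childBirthDate, childrenID) rows for one coupon."""
    return conn.execute(QUERY, {"today": today, "cupon_id": cupon_id}).fetchall()


conn = make_db()
print(eligible(conn, 1, "2014-04-20"))  # [(3, 'Group2', '2013-11-01', 5)]
```

Two caveats apply to this shape. Filtering by date inside the grouped subquery selects the youngest child within the window, whereas the question's query selects the user's youngest child overall and then checks the window. Also, two children of the same user with the same birth date would both be returned.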

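Explanation

  • Subqueries can be slow. Sometimes the optimizer fails to do the right thing. Writing joins with an ON clause should be safer.
  • Statements with GROUP BY are harder for the optimizer. Putting additional conditions inside them, so that fewer rows have to be grouped, may help.
  • It is very hard to use an index with LIKE '%something%'. LIKE 'something%' or LIKE 'something' is much faster.
  • Sometimes it is better to change * to an explicit list of the columns you need. Sometimes all of the required information is already in an index, and the table does not have to be read at all. This may help in corner cases.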
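
The LIKE point can be illustrated with a small sketch in Python with the standard-library sqlite3 module (used instead of MySQL only to keep the example self-contained). It compares the question's substring test with an exact-match IN list built from bound parameters, which is the kind of predicate an index on categoryGroup can generally serve. The helper names users_like and users_in_groups, the index name, and the extra user 5 with category 'Group' are hypothetical, added to show that the substring test can also match unintended rows.

```python
import sqlite3


def users_like(conn, group_list):
    """Substring test as in the question: the column value is wrapped in % wildcards."""
    sql = ("SELECT userID FROM users "
           "WHERE ? LIKE '%' || categoryGroup || '%' ORDER BY userID")
    return [row[0] for row in conn.execute(sql, (group_list,))]


def users_in_groups(conn, groups):
    """Exact-match test: one bound placeholder per group name."""
    groups = list(groups)
    if not groups:  # an empty IN () list is not valid in MySQL, so skip the query
        return []
    placeholders = ", ".join("?" for _ in groups)
    sql = (f"SELECT userID FROM users "
           f"WHERE categoryGroup IN ({placeholders}) ORDER BY userID")
    return [row[0] for row in conn.execute(sql, groups)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userID INTEGER PRIMARY KEY, categoryGroup TEXT)")
conn.execute("CREATE INDEX idx_users_group ON users (categoryGroup)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Group1"), (2, "Group1"), (3, "Group2"), (4, "Group2"),
     (5, "Group")],  # hypothetical row: 'Group' is a substring of the list string
)

print(users_like(conn, "Group1, Group2"))           # [1, 2, 3, 4, 5]
print(users_in_groups(conn, ["Group1", "Group2"]))  # [1, 2, 3, 4]
```

On the PHP side, the same idea would mean passing the group names as separate bound parameters rather than as one comma-separated string.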