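Making a large SQL query efficient

Time: 2014-01-14 21:25:01

Tags: mysql sql

I'm stuck on a fairly complex problem.

I want to display the "top five customers" along with some key metrics (conditional counts) for each customer. Each metric uses a completely different join structure.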

+-----------+------------+   +-----------+------------+    +-----------+------------+
| customer  |            |   | metricn   |            |    | metricn_lineitem       | 
+-----------+------------+   +-----------+------------+    +-----------+------------+
| id        | Name       |   | id        | customer_id|    |id         |metricn_id  |
| 1         | Customer1  |   | 1         | 1          |    | 1         | 1          |
| 2         | Customer2  |   | 2         | 2          |    | 2         | 1          |
+-----------+------------+   +-----------+------------+    +-----------+------------+

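The problem is that I always want to group by this customer table.

I first tried putting all the joins into the original query, but its performance was terrible. I then tried subqueries, but I could not group them by the original hospital ID.

Here is a sample query: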

SELECT 
     customer.name, 

     (SELECT COUNT(metric1_lineitem.id) 
      FROM metric1 INNER JOIN metric1_lineitem 
      ON metric1_lineitem.metric1_id = metric1.id
      WHERE metric1.customer_id = customer_id
      ) as metric_1,

     (SELECT COUNT(metric2_lineitem.id) 
      FROM metric2 INNER JOIN metric2_lineitem 
      ON metric2_lineitem.metric2_id = metric2.id
      WHERE metric2.customer_id = customer_id
      ) as metric_2

FROM customer
GROUP BY customer.name
SORT BY COUNT(metric1.id) DESC
LIMIT 5

Any suggestions? Thanks!

2 Answers:

Answer 0: (score: 1)

-- One pre-aggregated derived table per metric, joined back to customer.
-- The customer key is id in the schema shown in the question.
SELECT name, metric_1, metric_2
FROM customer AS c
LEFT JOIN (SELECT customer_id, COUNT(*) AS metric_1
           FROM metric1 AS m
           INNER JOIN metric1_lineitem AS l ON m.id = l.metric1_id
           GROUP BY customer_id) m1
ON m1.customer_id = c.id
LEFT JOIN (SELECT customer_id, COUNT(*) AS metric_2
           FROM metric2 AS m
           INNER JOIN metric2_lineitem AS l ON m.id = l.metric2_id
           GROUP BY customer_id) m2
ON m2.customer_id = c.id
ORDER BY metric_1 DESC
LIMIT 5

You should also avoid COUNT(columnname) when you can use COUNT(*) instead. The former has to test every value to see whether it is NULL.
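The difference between the two forms can be seen in a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL; the table t and its column note are hypothetical and not part of the question's schema.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table t and column note are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany("INSERT INTO t (note) VALUES (?)", [("a",), (None,), ("c",)])

# COUNT(*) counts rows; COUNT(note) counts only rows where note IS NOT NULL,
# so every value of note has to be inspected.
count_rows, count_note = conn.execute(
    "SELECT COUNT(*), COUNT(note) FROM t"
).fetchone()
print(count_rows, count_note)  # 3 2
```

When the counted column can never be NULL, both forms return the same number, so COUNT(*) is the simpler choice.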

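Answer 1: (score: 1)

Your data structure may be awkward, but your query is probably not that bad, with two exceptions. First, I don't think you need the aggregation at the outer level. Second, the "correlation" in the WHERE clause (e.g. metric1.customer_id = customer_id) does nothing, because the unqualified customer_id resolves to the local table inside the subquery. You need to compare against the outer table's key instead, which is id in the schema shown in the question: metric1.customer_id = c.id

-- Correlated subqueries: each one counts line items for the current customer row.
SELECT c.name, 
       (SELECT COUNT(metric1_lineitem.id) 
        FROM metric1 INNER JOIN
             metric1_lineitem 
             ON metric1_lineitem.metric1_id = metric1.id
        WHERE metric1.customer_id = c.id
       ) as metric_1,
       (SELECT COUNT(metric2_lineitem.id) 
        FROM metric2 INNER JOIN
             metric2_lineitem 
             ON metric2_lineitem.metric2_id = metric2.id
        WHERE metric2.customer_id = c.id
       ) as metric_2
FROM customer c
-- Sort by the first metric so that LIMIT 5 returns the top five customers.
ORDER BY metric_1 DESC
LIMIT 5;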
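The scoping problem described in this answer can be reproduced with a small sketch. It assumes an in-memory SQLite database as a stand-in for MySQL and uses the table layout from the question; the sample rows, and the third line item in particular, are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE metric1 (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE metric1_lineitem (id INTEGER PRIMARY KEY, metric1_id INTEGER);
INSERT INTO customer VALUES (1, 'Customer1'), (2, 'Customer2');
INSERT INTO metric1 VALUES (1, 1), (2, 2);
INSERT INTO metric1_lineitem VALUES (1, 1), (2, 1), (3, 2);
""")

TEMPLATE = """
SELECT c.name,
       (SELECT COUNT(l.id)
        FROM metric1 m
        INNER JOIN metric1_lineitem l ON l.metric1_id = m.id
        WHERE m.customer_id = {outer_ref}) AS metric_1
FROM customer c
ORDER BY c.id
"""

# Unqualified customer_id resolves to metric1.customer_id inside the subquery,
# so the predicate compares the column with itself and matches every row.
uncorrelated = conn.execute(TEMPLATE.format(outer_ref="customer_id")).fetchall()

# Qualifying the reference with the outer alias makes the subquery correlated.
correlated = conn.execute(TEMPLATE.format(outer_ref="c.id")).fetchall()

print(uncorrelated)  # [('Customer1', 3), ('Customer2', 3)]
print(correlated)    # [('Customer1', 2), ('Customer2', 1)]
```

With the unqualified name, every customer gets the total line item count. With the qualified reference, each customer gets its own count.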

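How can you make this run faster? One way is to introduce indexes. I would suggest metric1(customer_id), metric2(customer_id), metric1_lineitem(metric1_id), and metric2_lineitem(metric2_id).

This may be faster than the aggregation approach (proposed by Barmar), because MySQL is rather inefficient at aggregation. The indexes should allow the counts to be computed from the indexes alone rather than from the base tables.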
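The suggested indexes could be created along the following lines. This is a sketch under assumptions: it runs against an in-memory SQLite database rather than MySQL, the index names are hypothetical, and the sample rows are made up. The CREATE INDEX statements use the same syntax in MySQL, but the plan printed at the end is SQLite's, so the real effect should be checked with MySQL's own EXPLAIN.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; index names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE metric1 (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE metric1_lineitem (id INTEGER PRIMARY KEY, metric1_id INTEGER);
CREATE TABLE metric2 (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE metric2_lineitem (id INTEGER PRIMARY KEY, metric2_id INTEGER);
INSERT INTO customer VALUES (1, 'Customer1'), (2, 'Customer2');
INSERT INTO metric1 VALUES (1, 1), (2, 2);
INSERT INTO metric1_lineitem VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO metric2 VALUES (1, 1), (2, 2);
INSERT INTO metric2_lineitem VALUES (1, 2);
""")

QUERY = """
SELECT c.name,
       (SELECT COUNT(l.id)
        FROM metric1 m INNER JOIN metric1_lineitem l ON l.metric1_id = m.id
        WHERE m.customer_id = c.id) AS metric_1,
       (SELECT COUNT(l.id)
        FROM metric2 m INNER JOIN metric2_lineitem l ON l.metric2_id = m.id
        WHERE m.customer_id = c.id) AS metric_2
FROM customer c
ORDER BY metric_1 DESC
LIMIT 5
"""

before = conn.execute(QUERY).fetchall()

# One index per column used in the correlated lookups and joins.
conn.executescript("""
CREATE INDEX idx_metric1_customer ON metric1 (customer_id);
CREATE INDEX idx_metric2_customer ON metric2 (customer_id);
CREATE INDEX idx_metric1_lineitem_metric1 ON metric1_lineitem (metric1_id);
CREATE INDEX idx_metric2_lineitem_metric2 ON metric2_lineitem (metric2_id);
""")

after = conn.execute(QUERY).fetchall()
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)

# Indexes change how the rows are found, not which rows are returned.
print(before == after)
for step in conn.execute("EXPLAIN QUERY PLAN " + QUERY):
    print(step)
```

The query results are the same before and after the indexes exist. Only the access path changes, which is what the printed plan lets you inspect.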