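I'm stuck on a fairly complicated problem.
I want to display a "top five customers" list along with a few key metrics for each customer (computed under certain conditions). Each metric uses a completely different join structure.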
customer                  metricn                     metricn_lineitem
+----+-----------+        +----+-------------+        +----+------------+
| id | Name      |        | id | customer_id |        | id | metricn_id |
+----+-----------+        +----+-------------+        +----+------------+
|  1 | Customer1 |        |  1 |           1 |        |  1 |          1 |
|  2 | Customer2 |        |  2 |           2 |        |  2 |          1 |
+----+-----------+        +----+-------------+        +----+------------+
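The catch is that I always want to group by this customer table.
I first tried putting all the joins into the main query, but performance was terrible. I then tried subqueries, but I could not get them grouped by the original customer ID.
Here is a sample query: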
SELECT
customer.name,
(SELECT COUNT(metric1_lineitem.id)
FROM metric1 INNER JOIN metric1_lineitem
ON metric1_lineitem.metric1_id = metric1.id
WHERE metric1.customer_id = customer_id
) as metric_1,
(SELECT COUNT(metric2_lineitem.id)
FROM metric2 INNER JOIN metric2_lineitem
ON metric2_lineitem.metric2_id = metric2.id
WHERE metric2.customer_id = customer_id
) as metric_2
FROM customer
GROUP BY customer.name
SORT BY COUNT(metric1.id) DESC
LIMIT 5
Any suggestions? Thanks!
Answer 0: (score: 1)
SELECT c.name, m1.metric_1, m2.metric_2
FROM customer AS c
-- Pre-aggregate each metric per customer, then join the grouped results.
LEFT JOIN (SELECT m.customer_id, COUNT(*) AS metric_1
           FROM metric1 AS m
           INNER JOIN metric1_lineitem AS l ON m.id = l.metric1_id
           GROUP BY m.customer_id) AS m1
       ON m1.customer_id = c.id  -- customer's key column is id in the schema shown
LEFT JOIN (SELECT m.customer_id, COUNT(*) AS metric_2
           FROM metric2 AS m
           INNER JOIN metric2_lineitem AS l ON m.id = l.metric2_id
           GROUP BY m.customer_id) AS m2
       ON m2.customer_id = c.id
ORDER BY m1.metric_1 DESC
LIMIT 5;
You should also avoid COUNT(columnname) when COUNT(*) will do. The former has to test every value to see whether it is NULL.
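To illustrate the difference between the two forms, here is a minimal sketch. The table count_demo and its columns are hypothetical and exist only for this example.

```sql
-- Hypothetical scratch table, used only for this illustration.
CREATE TABLE count_demo (id INT, note VARCHAR(20));
INSERT INTO count_demo (id, note) VALUES (1, 'a'), (2, NULL), (3, 'c');

SELECT COUNT(*)    AS all_rows,        -- counts every row: 3
       COUNT(note) AS non_null_notes   -- skips the row where note IS NULL: 2
FROM count_demo;
```

When the column can never be NULL, both forms return the same number, so COUNT(*) states the intent more directly.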
Answer 1: (score: 1)
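Your data structure may be poor, but your query is probably not that bad, with two exceptions. First, I don't think you need the outer aggregation. Second, the "correlation" in the WHERE clause (for example, metric1.customer_id = customer_id) does nothing, because the unqualified customer_id comes from the local table. You need to qualify it with the outer table, as in metric1.customer_id = c.id (id being the customer key in the schema shown):
SELECT c.name,
       (SELECT COUNT(metric1_lineitem.id)
        FROM metric1 INNER JOIN
             metric1_lineitem
             ON metric1_lineitem.metric1_id = metric1.id
        WHERE metric1.customer_id = c.id
       ) AS metric_1,
       (SELECT COUNT(metric2_lineitem.id)
        FROM metric2 INNER JOIN
             metric2_lineitem
             ON metric2_lineitem.metric2_id = metric2.id
        WHERE metric2.customer_id = c.id
       ) AS metric_2
FROM customer c
ORDER BY metric_1 DESC  -- rank by the first metric, as the question intends
LIMIT 5;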
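The name-resolution point can be checked with a small reproduction. This is only a sketch: it uses a cut-down, hypothetical copy of the question's schema (metric1 only, no line items) and assumes the customer key is id, as in the table sketch in the question.

```sql
-- Hypothetical miniature of the question's schema, for illustration only.
CREATE TABLE customer (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE metric1  (id INT PRIMARY KEY, customer_id INT);
INSERT INTO customer (id, name) VALUES (1, 'Customer1'), (2, 'Customer2');
INSERT INTO metric1 (id, customer_id) VALUES (1, 1), (2, 1), (3, 2);

SELECT c.name,
       -- Unqualified customer_id resolves to metric1.customer_id,
       -- so the condition compares the column with itself.
       (SELECT COUNT(*) FROM metric1
         WHERE metric1.customer_id = customer_id) AS uncorrelated,
       -- Qualified with the outer alias: a genuinely correlated subquery.
       (SELECT COUNT(*) FROM metric1
         WHERE metric1.customer_id = c.id) AS correlated
FROM customer c
ORDER BY c.id;
-- Customer1: uncorrelated = 3, correlated = 2
-- Customer2: uncorrelated = 3, correlated = 1
```

The uncorrelated column repeats the total row count for every customer, which is why the original query appears to run but reports the same numbers on every row.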
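How do you make this run faster? One way is to add indexes. I would suggest metric1(customer_id), metric2(customer_id), metric1_lineitem(metric1_id), and metric2_lineitem(metric2_id).
This may be faster than the aggregation approach (proposed by Barmar), because MySQL is relatively inefficient at aggregation. The indexes should allow the counts to be computed from the indexes alone rather than from the base tables.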
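The indexes suggested above could be created along these lines. The index names are made up for this sketch, and the statements assume the tables already exist with the columns shown in the question.

```sql
-- Index names (idx_...) are hypothetical; the columns are the ones suggested above.
CREATE INDEX idx_metric1_customer        ON metric1 (customer_id);
CREATE INDEX idx_metric2_customer        ON metric2 (customer_id);
CREATE INDEX idx_metric1_lineitem_metric ON metric1_lineitem (metric1_id);
CREATE INDEX idx_metric2_lineitem_metric ON metric2_lineitem (metric2_id);

-- Inspect the plan for one of the per-customer lookups to see
-- whether the new indexes are actually chosen.
EXPLAIN
SELECT COUNT(metric1_lineitem.id)
FROM metric1 INNER JOIN metric1_lineitem
     ON metric1_lineitem.metric1_id = metric1.id
WHERE metric1.customer_id = 1;
```

In the EXPLAIN output, "Using index" in the Extra column indicates that a table is being read from the index alone. Whether that happens here depends on the storage engine and the actual table definitions, so it is worth checking on real data.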