Question

The first answer is:

SELECT students.student_id, student_name, father_name, mother_name,
       COUNT(student_addresses.student_id) AS total_addresses,
       COUNT(student_phones.student_id) AS total_phones
FROM students, student_phones, student_addresses
WHERE students.student_id = student_phones.student_id AND
      students.student_id = student_addresses.student_id AND
      students.student_id = 7
GROUP BY students.student_id, student_name, father_name, mother_name;

And the second is:

SELECT s.student_id,
       max(s.student_name) student_name,
       max(s.father_name) father_name,
       max(s.mother_name) mother_name,
       COUNT(distinct a.student_address_id) total_addresses,    
       COUNT(distinct p.student_phone_id) total_phones
FROM students s
LEFT JOIN student_phones p ON s.student_id = p.student_id
LEFT JOIN student_addresses a ON s.student_id = a.student_id
WHERE s.student_id = 7
GROUP BY s.student_id

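Now, the question is: is there a significant difference in performance between the two queries? Does the use of MAX() affect the execution time of the second query?

I tried searching Google for an answer, with no luck. I would like a clear and specific explanation.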

Answer 1

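The two queries do not do the same thing, even if the four columns (students.student_id, student_name, father_name, mother_name) are unique.

From a logical standpoint they differ in two ways. The first query uses inner joins, so it returns no row at all for a student who has no phone or no address; the second uses LEFT JOINs and still returns such a student. The counts can also differ, depending on the data: the first query counts every row of the phones-by-addresses join, so a student with two phones and three addresses gets 6 for both totals, while COUNT(DISTINCT ...) in the second gives 3 addresses and 2 phones.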
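
To make the logical difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout follows the two queries in the question, but the sample rows are made up for illustration (student 7 has two phones and three addresses, student 8 has one address and no phone), and the name columns are left out of the SELECT lists to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_name TEXT,
                       father_name TEXT, mother_name TEXT);
CREATE TABLE student_phones (student_phone_id INTEGER PRIMARY KEY, student_id INTEGER);
CREATE TABLE student_addresses (student_address_id INTEGER PRIMARY KEY, student_id INTEGER);
-- Hypothetical sample data, not from the original question.
INSERT INTO students VALUES (7, 'Ann', 'Bob', 'Cat'), (8, 'Dan', 'Ed', 'Fay');
INSERT INTO student_phones (student_id) VALUES (7), (7);
INSERT INTO student_addresses (student_id) VALUES (7), (7), (7), (8);
""")

# First form: implicit inner joins, plain COUNT.
IMPLICIT_JOIN = """
SELECT students.student_id,
       COUNT(student_addresses.student_id) AS total_addresses,
       COUNT(student_phones.student_id) AS total_phones
FROM students, student_phones, student_addresses
WHERE students.student_id = student_phones.student_id
  AND students.student_id = student_addresses.student_id
  AND students.student_id = ?
GROUP BY students.student_id
"""

# Second form: LEFT JOINs, COUNT(DISTINCT ...) on each child table's key.
LEFT_JOIN = """
SELECT s.student_id,
       COUNT(DISTINCT a.student_address_id) AS total_addresses,
       COUNT(DISTINCT p.student_phone_id) AS total_phones
FROM students s
LEFT JOIN student_phones p ON s.student_id = p.student_id
LEFT JOIN student_addresses a ON s.student_id = a.student_id
WHERE s.student_id = ?
GROUP BY s.student_id
"""

def run(sql, student_id):
    return conn.execute(sql, (student_id,)).fetchall()

first_7 = run(IMPLICIT_JOIN, 7)   # [(7, 6, 6)]: 2 phones x 3 addresses = 6 joined rows
second_7 = run(LEFT_JOIN, 7)      # [(7, 3, 2)]
first_8 = run(IMPLICIT_JOIN, 8)   # []: no phone, so the inner join drops the student
second_8 = run(LEFT_JOIN, 8)      # [(8, 1, 0)]
```

With this data the first form reports 6 and 6 for student 7 and nothing for student 8, while the second reports 3 and 2, and still returns student 8 with zero phones.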

From a performance standpoint, the main difference is between:

       COUNT(student_addresses.student_id) AS total_addresses,    
       COUNT(student_phones.student_id) AS total_phones

and

       COUNT(distinct a.student_address_id) total_addresses,
       COUNT(distinct p.student_phone_id) total_phones

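count(distinct) is more expensive because the SQL engine has to keep track of every distinct value it has seen. In extreme cases those values can exceed available memory and cause additional I/O. For a plain count(), the engine just adds one to a counter and does no list maintenance at all.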
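
One rough way to picture this is by the per-group state each aggregate has to keep. The sketch below is a conceptual model in Python, not a description of how any particular engine is implemented (real engines may sort or hash instead), and the class names are hypothetical.

```python
class CountAccumulator:
    """State for COUNT(expr): a single integer per group."""

    def __init__(self):
        self.n = 0

    def step(self, value):
        if value is not None:  # COUNT(expr) ignores NULLs
            self.n += 1

    def result(self):
        return self.n


class CountDistinctAccumulator:
    """State for COUNT(DISTINCT expr): every distinct value seen so far."""

    def __init__(self):
        self.seen = set()

    def step(self, value):
        if value is not None:  # NULLs are ignored here as well
            self.seen.add(value)

    def result(self):
        return len(self.seen)


# Made-up input values for one group; None stands in for SQL NULL.
values = [1, 1, 2, None, 2, 3]
plain, distinct = CountAccumulator(), CountDistinctAccumulator()
for v in values:
    plain.step(v)
    distinct.step(v)

plain_count = plain.result()        # 5 non-NULL values
distinct_count = distinct.result()  # 3 distinct non-NULL values
```

The plain counter stays the same size no matter how many rows it sees, whereas the set grows with the number of distinct values, which is where the extra memory and possible spill to disk come from.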

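Similarly, min() and max() have very little overhead: the engine does a comparison and overwrites a stored value. This is a small amount of extra work and is unlikely to affect performance. Offsetting it is the fact that the group by key is shorter. A shorter key may help performance, depending on the underlying algorithm. Either way, both queries pass the same amount of data through the group by, so the overall effect of key length (whatever the algorithm) is likely to be minimal.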
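
The short-key form relies on the name columns having a single value per student_id, so MAX() simply returns that value. A minimal check with Python's sqlite3 module, assuming student_id is the primary key of students and using made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_name TEXT,
                       father_name TEXT, mother_name TEXT);
CREATE TABLE student_phones (student_phone_id INTEGER PRIMARY KEY, student_id INTEGER);
-- Hypothetical sample data, not from the original question.
INSERT INTO students VALUES (7, 'Ann', 'Bob', 'Cat');
INSERT INTO student_phones (student_id) VALUES (7), (7), (7);
""")

# Long key: every selected column is listed in GROUP BY.
LONG_KEY = """
SELECT s.student_id, s.student_name, s.father_name, s.mother_name,
       COUNT(p.student_phone_id) AS total_phones
FROM students s
LEFT JOIN student_phones p ON s.student_id = p.student_id
GROUP BY s.student_id, s.student_name, s.father_name, s.mother_name
"""

# Short key: group by the primary key only and wrap the rest in MAX().
SHORT_KEY = """
SELECT s.student_id,
       MAX(s.student_name) AS student_name,
       MAX(s.father_name) AS father_name,
       MAX(s.mother_name) AS mother_name,
       COUNT(p.student_phone_id) AS total_phones
FROM students s
LEFT JOIN student_phones p ON s.student_id = p.student_id
GROUP BY s.student_id
"""

long_key_rows = conn.execute(LONG_KEY).fetchall()
short_key_rows = conn.execute(SHORT_KEY).fetchall()
```

Both statements return the same single row here, (7, 'Ann', 'Bob', 'Cat', 3), so the choice between them affects how the grouping is expressed rather than the result.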

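In short, any difference in performance comes from the count(distinct), not from the max(). You should decide whether count(distinct) is what you really need and write the query according to your requirements. The second form is preferable because it uses explicit ANSI-standard join syntax.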
