Question

TL;DR
When querying the same table, why does the first query below (20 rows scanned) take longer than the second one (35k+ rows scanned)?

First query:

id  select_type     table       type         possible_keys     key                            key_len     ref     rows    Extra
1   SIMPLE          Groups      range        <lots of keys>    group_name_ip_address_idx      317         NULL    20      Using index condition; Using where

Second query:

id  select_type     table       type         possible_keys     key                    key_len     ref     rows    Extra
1   SIMPLE          Groups      ref_or_null  <lots of keys>    email_address_idx      768         const   35415   Using index condition; Using where

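I have been treating the "rows" column of the EXPLAIN output as a direct indicator of query performance (could that be wrong?), yet the 20-row query takes far longer than the 35k-row one. I am not an SQL expert; could someone explain what might cause this?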

Long version:
I have a table "Groups" with a field "group_name" and 20 other fields holding customer information ("field_1", "field_2", ..., "field_20").

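I use this table to determine which group a customer matches when I see that customer's information. As a toy example, suppose the table contains a record whose group name is "US male" and whose 20 fields are all null except "citizenship", which is "US", and "gender", which is "male". Then whenever I see a customer with citizenship "US" and gender "male", that customer matches the "US male" group.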
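To make the matching rule concrete, here is a minimal sketch of the toy example. The column types, the second sample row, and the reduction to two customer fields are assumptions for illustration only; the real table has 20 such fields. The table name is backtick-quoted because GROUPS is a reserved word in MySQL 8.0.

```sql
-- Hypothetical, cut-down schema: only 2 of the 20 customer fields are shown,
-- and the column types are assumed rather than taken from the real table.
CREATE TABLE `Groups` (
  group_name  VARCHAR(255) NOT NULL,
  citizenship VARCHAR(64)  NULL,
  gender      VARCHAR(16)  NULL
);

-- A NULL field means "this group does not care about that attribute".
INSERT INTO `Groups` (group_name, citizenship, gender)
VALUES ('US male', 'US', 'male'),
       ('US any gender', 'US', NULL);  -- second row is made up for contrast

-- A customer with citizenship 'US' and gender 'male' matches a group when
-- every field is either equal to the customer's value or NULL.
SELECT group_name
FROM `Groups`
WHERE (citizenship = 'US' OR citizenship IS NULL)
  AND (gender = 'male' OR gender IS NULL);
```

With these two sample rows, both groups satisfy the condition, which is the "one customer can match multiple groups" case.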

To do this I am using the following query (Query1), which takes 3 to 5 ms:

select * from Groups 
where group_name = "US male" 
  and (field_1 = "something1" or field_1 is null) 
  and (field_2 = "something2" or field_2 is null) 
  and ... and (field_20 = "something20" or field_20 is null)

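The "something" values stand for the current customer's information, and I want to know which groups he or she matches. If the query returns anything, there is a match; otherwise there is not.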

EXPLAIN output for the query above:

id  select_type     table       type         possible_keys     key                               key_len     ref           rows    Extra
1   SIMPLE          Groups      ref_or_null  <lots of keys>    group_name_email_address_idx      962         const,const   2       Using index condition; Using where

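Note that one customer can match multiple groups, so for N group names I need N of the queries above. Now that N keeps growing, I would like to do the same thing with a single query instead of N small ones, and this is where I ran into the problem.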

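I first tried removing the group_name = "XXXX" condition from the where clause, so that all matching groups are selected at once instead of being checked one by one (Query2).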

select * from Groups 
where (field_1 = "something1" or field_1 is null) 
  and (field_2 = "something2" or field_2 is null) 
  and ... and (field_20 = "something20" or field_20 is null)

EXPLAIN output:

id  select_type     table       type         possible_keys     key                    key_len     ref     rows    Extra
1   SIMPLE          Groups      ref_or_null  <lots of keys>    email_address_idx      768         const   35415   Using index condition; Using where

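This is slow (~70 ms) because it cannot use any of the indexes that include the group name, which I believe are the most efficient ones because group_name has the lowest cardinality. (The estimated number of rows to scan is 35k, versus 2 for Query1.) So this does not work well.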
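The cardinality assumption above can be checked rather than guessed. A minimal sketch, assuming the table lives in the current database and that refreshing statistics on it is acceptable:

```sql
-- Refresh the index statistics that the optimizer's row estimates rely on.
ANALYZE TABLE `Groups`;

-- List every index on the table; the Cardinality column is the estimated
-- number of distinct values for each indexed column prefix.
SHOW INDEX FROM `Groups`;
```

The Cardinality values are estimates, so they are only a rough guide to how selective each index is.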

Then, to get the query to use a group-name index, I added group_name in (<all group names>) to the where clause (Query3):

select * from Groups 
where group_name in ("group1", "group2", ..., "groupN") 
  and (field_1 = "something1" or field_1 is null) 
  and (field_2 = "something2" or field_2 is null) 
  and ... and (field_20 = "something20" or field_20 is null)

EXPLAIN output:

id  select_type     table       type         possible_keys     key                            key_len     ref     rows    Extra
1   SIMPLE          Groups      range        <lots of keys>    group_name_ip_address_idx      317         NULL    20      Using index condition; Using where

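I see that it only needs to scan 20 rows, far fewer than 35415, so I expected it to run fast. However, when I actually ran it, it took much longer than Query2 (about 20x) and broke my service.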
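Since "rows" in EXPLAIN is only the optimizer's estimate, one way to compare it with what the server actually does is to read the handler counters after running the query. This is a rough sketch: the query is shortened to a single field condition and two group names purely for illustration, FLUSH STATUS needs sufficient privileges, and EXPLAIN ANALYZE is only available on MySQL 8.0.18 or later.

```sql
-- Reset this session's status counters so they reflect only the next query.
FLUSH STATUS;

-- Shortened stand-in for Query3 (the real query has 20 field conditions).
SELECT * FROM `Groups`
WHERE group_name IN ('group1', 'group2')
  AND (field_1 = 'something1' OR field_1 IS NULL);

-- Handler_read_* counts the index and row reads the query actually performed.
SHOW SESSION STATUS LIKE 'Handler_read%';

-- On MySQL 8.0.18+, this executes the query and reports actual row counts
-- and timings next to the optimizer's estimates.
EXPLAIN ANALYZE
SELECT * FROM `Groups`
WHERE group_name IN ('group1', 'group2')
  AND (field_1 = 'something1' OR field_1 IS NULL);
```

If the actual reads are far above 20, the estimate is what is misleading, not the timing.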

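After all this, I am thoroughly confused: why does a query that scans 20 rows take longer than one that scans 35,000? Am I reading the EXPLAIN output wrong?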

sql - Why does a query with fewer rows to scan (according to EXPLAIN) actually run much slower than a query with more rows?
