Question

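I have a query that forces a full index scan on an InnoDB table. That is expected, but the performance is still much slower than I anticipated. The table is structured as follows: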

Field           Type            Null  Key
CUSTOMER_ID     int(11)         NO    MUL
CustLatitude    decimal(15,12)  YES
CustLongitude   decimal(15,12)  YES
StoreLatitude   decimal(15,12)  NO
StoreLongitude  decimal(15,12)  NO
StoreID         int(11)         NO    MUL
Distance        double          YES   MUL

For each CUSTOMER_ID, I select the row with the smallest Distance value, like this:

select 
distinct(CUSTOMER_ID) as incustid, 
(select StoreID from CustomerStoreDistance where CUSTOMER_ID = incustid 
    order by Distance ASC limit 1) as closeststoreid 
from
CustomerStoreDistance;

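As shown above, there are indexes on CUSTOMER_ID, Distance, and StoreID. The CustomerStoreDistance table has about 43M rows and runs on RDS, on a db.cr1.8xlarge instance with 244 GB of RAM and 32 vCPUs.

As far as I can tell, the parameters are already tuned for sorting, temp space, and so on. Still, I am curious whether there is a better approach and/or further optimizations.

Thanks, Chad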

Answer 1

I think you would be better off with a separate customers table. In any case, try this version of the query:

select incustid,
       (select StoreID
        from CustomerStoreDistance csd2
        where csd2.CUSTOMER_ID = csd.incustid
        order by Distance
        limit 1
       ) as ClosestStoreId
from (select distinct CUSTOMER_ID as incustid
      from CustomerStoreDistance
     ) csd;

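The derived table in the from clause keeps the distinct separate from the correlated subquery. I think MySQL otherwise runs the correlated subquery for every row before applying the distinct, which is just wasted effort.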

To optimize this query, you want a composite index on CustomerStoreDistance(CUSTOMER_ID, Distance, StoreID).
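A minimal sketch of creating the composite index suggested above and checking whether the optimizer uses it. The index name idx_customer_distance_store and the customer id 1 are placeholders, not taken from the question; the column names follow the table definition in the question.

```sql
-- Hypothetical index name; columns are ordered so that one customer's rows
-- are stored together and already sorted by Distance.
CREATE INDEX idx_customer_distance_store
    ON CustomerStoreDistance (CUSTOMER_ID, Distance, StoreID);

-- Inspect the plan for a single customer (1 is a placeholder id).
-- If the index covers the query, the Extra column should show "Using index".
EXPLAIN
SELECT StoreID
FROM CustomerStoreDistance
WHERE CUSTOMER_ID = 1
ORDER BY Distance
LIMIT 1;
```

With this column order the correlated subquery can, in principle, read the first matching index entry per customer without sorting and without touching the table rows.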

EDIT:

Because these queries already need an aggregation to eliminate duplicates, this might perform better:

select CUSTOMER_ID,
       substring_index(group_concat(StoreID order by Distance), ',', 1) as ClosestStoreId
from CustomerStoreDistance
group by CUSTOMER_ID;
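One thing to note about the group_concat version: substring_index returns a string, so ClosestStoreId comes back as text rather than an integer. If the numeric type matters downstream, a cast can be added. This is a small variation on the query above, not part of the original answer:

```sql
-- Same aggregation as above, with the result cast back to an integer.
-- StoreID is int(11) in the question's table, hence "signed".
select CUSTOMER_ID,
       cast(substring_index(group_concat(StoreID order by Distance), ',', 1)
            as signed) as ClosestStoreId
from CustomerStoreDistance
group by CUSTOMER_ID;
```

group_concat output is limited by group_concat_max_len. Only the first element of the list is used here, so a truncated tail should not change the result, though MySQL may report a truncation warning for customers with many stores.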
