How can I speed up a MySQL query that finds the nearest locations to a given latitude/longitude?

Asked: 2014-03-25 19:25:14

Tags: mysql execution-time

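I have a zip code table in my database that is used together with a business table to find the businesses matching certain criteria that are closest to a specified zip code. The first thing I do is grab the latitude and longitude, since they are used in several places on the page. I use: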

$zipResult = mysql_fetch_array(mysql_query("SELECT latitude,longitude FROM zipCodes WHERE zipCode='".mysql_real_escape_string($_SESSION['zip'])."' Limit 1"));
$latitude = $zipResult['latitude'];
$longitude = $zipResult['longitude'];
$radius = 100;

$lon1 = $longitude - $radius / abs(cos(deg2rad($latitude))*69);
$lon2 = $longitude + $radius / abs(cos(deg2rad($latitude))*69);
$lat1 = $latitude - ($radius/69);
$lat2 = $latitude + ($radius/69);

From there, I generate the query:

$query2 = "Select * From (SELECT business.*,zipCodes.longitude,zipCodes.latitude,
            (3956 * 2 * ASIN ( SQRT (POWER(SIN((zipCodes.latitude - $latitude)*pi()/180 / 2),2) + COS(zipCodes.latitude* pi()/180) * COS($latitude *pi()/180) * POWER(SIN((zipCodes.longitude - $longitude) *pi()/180 / 2), 2) ) )) as distance FROM business INNER JOIN zipCodes ON (business.listZip = zipCodes.zipCode)
            Where business.active = 1
            And (3958*3.1415926*sqrt((zipCodes.latitude-$latitude)*(zipCodes.latitude-$latitude) + cos(zipCodes.latitude/57.29578)*cos($latitude/57.29578)*(zipCodes.longitude-$longitude)*(zipCodes.longitude-$longitude))/180) <= '$radius'
            And zipCodes.longitude between $lon1 and $lon2 and zipCodes.latitude between $lat1 and $lat2
            GROUP BY business.id ORDER BY distance) As temp Group By category_id ORDER BY distance LIMIT 18";

Which results in:

Select * 
From (SELECT business.*,zipCodes.longitude,zipCodes.latitude, (3956 * 2 * ASIN ( SQRT (POWER(SIN((zipCodes.latitude - 39.056784)*pi()/180 / 2),2) + COS(zipCodes.latitude* pi()/180) * COS(39.056784 *pi()/180) * POWER(SIN((zipCodes.longitude - -84.343573) *pi()/180 / 2), 2) ) )) as distance 
               FROM business 
               INNER JOIN zipCodes ON (business.listZip = zipCodes.zipCode) 
               Where business.active = 1 
               And (3958*3.1415926*sqrt((zipCodes.latitude-39.056784)*(zipCodes.latitude-39.056784) + cos(zipCodes.latitude/57.29578)*cos(39.056784/57.29578)*(zipCodes.longitude--84.343573)*(zipCodes.longitude--84.343573))/180) <= '100' 
               And zipCodes.longitude between -86.2099407074 and -82.4772052926 
               and zipCodes.latitude between 37.6075086377 and 40.5060593623 
               GROUP BY business.id 
               ORDER BY distance) As temp 
Group By category_id 
ORDER BY distance 
LIMIT 18

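The code runs and executes just fine, but it takes a little over a second to complete (usually about 1.1 seconds). However, I have been told that the page loads slowly in some browsers. I have tested it in multiple browsers and multiple versions of those browsers without running into any problems. Still, I figure that cutting the execution time would help either way. The problem is that I don't know what else I can do to reduce it. The zip code table came with preset indexes, which I believe are fine (and they contain the columns I use in my query). I have also added indexes to the business table, although I don't know much about them. I made sure to include at least the fields used in the WHERE clause, and maybe a few more.

If I need to add my indexes to this question, just let me know. If you see anything in the query that I could improve, please let me know as well.

Thanks, James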

Edit

Table structure for the business table:

CREATE TABLE IF NOT EXISTS `business` (
  `id` smallint(6) unsigned NOT NULL AUTO_INCREMENT,
  `active` tinyint(3) unsigned NOT NULL,
  `featured` enum('yes','no') NOT NULL DEFAULT 'yes',
  `topFeatured` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `category_id` smallint(5) NOT NULL DEFAULT '0',
  `listZip` varchar(12) NOT NULL,
  `name` tinytext NOT NULL,
  `address` tinytext NOT NULL,
  `city` varchar(128) NOT NULL,
  `state` varchar(32) NOT NULL DEFAULT '',
  `zip` varchar(12) NOT NULL,
  `phone` tinytext NOT NULL,
  `alt_phone` tinytext NOT NULL,
  `website` tinytext NOT NULL,
  `logo` tinytext NOT NULL,
  `index_logo` tinytext NOT NULL,
  `large_image` tinytext NOT NULL,
  `description` text NOT NULL,
  `views` int(5) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `featured` (`featured`,`topFeatured`,`category_id`,`listZip`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=3085 ;

SQL Fiddle

http://sqlfiddle.com/#!2/2e26ff/1

Edit 2014-03-26 09:09

I have updated my query, but the shorter query actually takes about 0.2 seconds longer to execute each time.

Select * From (
    SELECT Distinct business.id, business.name, business.large_image, business.logo, business.address, business.city, business.state, business.zip, business.phone, business.alt_phone, business.website, business.description, zipCodes.longitude, zipCodes.latitude, (3956 * 2 * ASIN ( SQRT (POWER(SIN((zipCodes.latitude - 39.056784)*pi()/180 / 2),2) + COS(zipCodes.latitude* pi()/180) * COS(39.056784 *pi()/180) * POWER(SIN((zipCodes.longitude - -84.343573) *pi()/180 / 2), 2) ) )) as distance 
    FROM business 
    INNER JOIN zipCodes ON (business.listZip = zipCodes.zipCode) 
    Where business.active = 1 
    And zipCodes.longitude between -86.2099407074 and -82.4772052926 
    And zipCodes.latitude between 37.6075086377 and 40.5060593623 
    GROUP BY business.category_id 
    HAVING distance <= '50'
    ORDER BY distance
) As temp LIMIT 18

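The zip code, latitude, and longitude fields in the zip code database are already indexed, both together in one index and each in an index of its own. That is how the table came when it was purchased.

Yesterday I updated the listZip data type to match the zip data type in the zip code table.

I did take out GROUP BY business.id and replace it with DISTINCT, but I kept GROUP BY business.category_id because I only want one business per category.

Also, the 0.2 second difference in execution time started as soon as I changed the query to use the HAVING clause instead of the math formula in the WHERE clause. I did try using WHERE distance <= 50 in the outer query, but that did not speed anything up either. Using 50 miles instead of 100 miles does not seem to affect this particular query either.

Thanks for all the suggestions so far.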

1 Answer:

Answer 0: (score: 1)

Put an index on zipCodes.longitude and zipCodes.latitude. That should help a lot.

For details, see here: http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/

Edit: You need an index on the zipCodes table on longitude, or one that starts with longitude. It seems to me you should try a compound index on

 (longitude, latitude, zipCode)

for best results.
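
As a rough sketch of that suggestion: the index name idx_lon_lat_zip below is made up for illustration, and the column order simply follows the list above. Since the table reportedly already ships with indexes, it is worth checking what exists first; the EXPLAIN at the end only shows whether the optimizer chooses a range scan, not how much time it saves.

```sql
-- List the indexes that already exist, including their column order.
SHOW INDEX FROM zipCodes;

-- Hypothetical index name; the columns are the ones suggested above.
ALTER TABLE zipCodes
  ADD INDEX idx_lon_lat_zip (longitude, latitude, zipCode);

-- Check whether the bounding-box filter now uses a range scan on the index.
EXPLAIN
SELECT zipCode, latitude, longitude
FROM zipCodes
WHERE longitude BETWEEN -86.2099407074 AND -82.4772052926
  AND latitude  BETWEEN 37.6075086377 AND 40.5060593623;
```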

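Make the data types of zipCodes.zipCode and business.listZip the same so the join is more efficient. If the data types differ, MySQL converts one type to the other while joining, which makes the join inefficient. Make sure business.listZip has an index.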
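
A minimal sketch of those two steps, under assumptions: the posted schema does not show the definition of zipCodes.zipCode, so the VARCHAR(12) target type is a placeholder to be replaced with whatever SHOW CREATE TABLE reports, and idx_listZip is a made-up index name. Note that the existing `featured` key on business lists listZip as its fourth column, so MySQL cannot use it to look up rows by listZip alone.

```sql
-- Inspect the real column type (and character set) of zipCodes.zipCode.
SHOW CREATE TABLE zipCodes;

-- Assumption: zipCodes.zipCode is VARCHAR(12) in the same character set.
-- Replace the type below with the one reported above.
ALTER TABLE business
  MODIFY listZip VARCHAR(12) NOT NULL,
  ADD INDEX idx_listZip (listZip);   -- hypothetical index name
```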

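You are misusing GROUP BY. (Did you mean SELECT DISTINCT?) It makes no sense unless you also use an aggregate function such as MAX(). In a similar vein, see if you can get rid of the * in SELECT business.* and list the columns you actually need instead.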
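
To illustrate the point about aggregates, here is a sketch in which GROUP BY is paired with MIN(), so each group has a well-defined result. It returns only the shortest distance per category, not the matching business row; getting the full row would need a further join back to business. The coordinates are the ones from the question, and the aliases b and z are introduced here for brevity.

```sql
-- One row per category: the distance to its nearest active business.
SELECT b.category_id,
       MIN(3956 * 2 * ASIN(SQRT(
             POWER(SIN(RADIANS(z.latitude - 39.056784) / 2), 2)
           + COS(RADIANS(z.latitude)) * COS(RADIANS(39.056784))
           * POWER(SIN(RADIANS(z.longitude - (-84.343573)) / 2), 2)
       ))) AS nearest_distance
FROM business AS b
INNER JOIN zipCodes AS z ON b.listZip = z.zipCode
WHERE b.active = 1
  AND z.longitude BETWEEN -86.2099407074 AND -82.4772052926
  AND z.latitude  BETWEEN 37.6075086377 AND 40.5060593623
GROUP BY b.category_id
ORDER BY nearest_distance;
```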

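100 miles is a very wide search radius. Narrow it to speed things up.

You compute the great-circle distance twice. Surely you can rewrite the query so that it is done once.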
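
One possible way to do that, sketched with the values from the question: compute the haversine distance once in a derived table, restricted by the bounding box, and filter on the alias in the outer query. This sketch assumes each zipCode appears only once in zipCodes (so no GROUP BY business.id is needed), lists only a few columns for brevity, and leaves out the one-business-per-category step.

```sql
SELECT t.*
FROM (
    SELECT b.id, b.category_id, b.name,
           z.latitude, z.longitude,
           -- Haversine distance in miles, computed a single time per row.
           3956 * 2 * ASIN(SQRT(
               POWER(SIN(RADIANS(z.latitude - 39.056784) / 2), 2)
             + COS(RADIANS(z.latitude)) * COS(RADIANS(39.056784))
             * POWER(SIN(RADIANS(z.longitude - (-84.343573)) / 2), 2)
           )) AS distance
    FROM business AS b
    INNER JOIN zipCodes AS z ON b.listZip = z.zipCode
    WHERE b.active = 1
      -- The bounding box narrows the rows before any trigonometry runs.
      AND z.longitude BETWEEN -86.2099407074 AND -82.4772052926
      AND z.latitude  BETWEEN 37.6075086377 AND 40.5060593623
) AS t
WHERE t.distance <= 100
ORDER BY t.distance
LIMIT 18;
```

Whether this is actually faster depends on the data and the indexes in place, so it is worth comparing timings and EXPLAIN output against the original query.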