需要SQL查询性能方面的帮助

时间:2016-04-27 13:14:44

标签: mysql query-performance

我有这张桌子:

 CREATE TABLE `property_ads_history` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `property_ad_id` int(10) unsigned NOT NULL,
 `advertiser_type` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `agency_reference_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `average_sale_price` double(8,2) NOT NULL DEFAULT '-1.00',
 `bathrooms` double(8,2) NOT NULL DEFAULT '-1.00',
 `bedrooms` double(8,2) NOT NULL DEFAULT '-1.00',
 `carports` double(8,2) NOT NULL DEFAULT '-1.00',
 `DELETE_country` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `created_reason` enum('Geocoded','Sanitized Parking','Sanitized Representation','Sanitized Address','Scraped','URL Inserted','QA Sanitized Address','QA Sanitized Representation','QA Sanitized Parkings') COLLATE utf8_unicode_ci DEFAULT NULL,
 `description` longtext COLLATE utf8_unicode_ci NOT NULL,
 `ensuite_bathrooms` double(8,2) NOT NULL DEFAULT '-1.00',
 `DELETE_ad_expired_at` datetime NOT NULL,
 `floor_area` double(8,2) NOT NULL DEFAULT '-1.00',
 `formatted_address` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `garages` double(8,2) NOT NULL DEFAULT '-1.00',
 `geocode_status` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
 `is_represented` tinyint(1) DEFAULT NULL,
 `land_area` double(8,2) NOT NULL DEFAULT '-1.00',
 `latitude` double(10,6) NOT NULL,
 `location_id` int(10) unsigned DEFAULT NULL,
 `longitude` double(10,6) NOT NULL,
 `off_street_parkings` double(8,2) NOT NULL DEFAULT '-1.00',
 `official_property_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `page_url` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `parking` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `posted_at` datetime NOT NULL,
 `posted_at_string` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `postal_code` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `price` double(10,2) NOT NULL DEFAULT '-1.00',
 `primary_image` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `DELETE_property_ad_created_at` datetime NOT NULL,
 `property_type` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `rateable_value` double(10,2) NOT NULL DEFAULT '-1.00',
 `recent_sale_1` double(10,2) NOT NULL DEFAULT '-1.00',
 `recent_sale_2` double(10,2) NOT NULL DEFAULT '-1.00',
 `recent_sale_3` double(10,2) NOT NULL DEFAULT '-1.00',
 `reviewer_comments` longtext COLLATE utf8_unicode_ci NOT NULL,
 `route` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `source_id` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
 `street_number` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `user_id` int(10) unsigned DEFAULT NULL,
 `user_provided_address` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 `updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 `deleted_at` timestamp NULL DEFAULT NULL,
 PRIMARY KEY (`id`),
 UNIQUE KEY `unique_index` (`created_at`,`created_reason`,`source_id`),
 KEY `property_ads_history_property_ad_id_foreign` (`property_ad_id`),
 KEY `property_ads_history_location_id_foreign` (`location_id`),
 KEY `created_reason` (`created_reason`)
);

这是我的SQL查询:

SELECT * FROM `property_ads_history` `t1` 
WHERE `t1`.`created_at` >= '2016-04-13 00:00:00' 
AND `t1`.`created_reason` = 'Scraped' 
AND (`t1`.`price` > -1 OR `t1`.`price` <> 999999.99) 
AND (
  SELECT `t2`.`price` FROM `property_ads_history` `t2` 
  WHERE `t2`.`property_ad_id` = `t1`.`property_ad_id` 
  AND `t2`.`created_at` < `t1`.`created_at` 
  AND (`t2`.`price` > -1 OR `t2`.`price` <> 999999.99) ORDER BY  
  DATE(`t2`.`created_at`) DESC LIMIT 1
) <> `t1`.`price` GROUP BY `t1`.`property_ad_id`;

我在查询中想要的只有:

  1. 我想获取符合特定条件的记录,尤其是created_at字段在过去24小时内

  2. 我需要获得紧接在#1

  3. 中记录之前的记录
  4. 进一步过滤结果#1,记录的价格列在记录的历史记录中具有不同的值,而不是-1

  5. 我遇到的问题是查询太慢了。我有一百万条记录。执行查询需要或多或少2分钟。我猜测GROUP BY使查询变慢。我想我需要制作综合索引,但我不知道该怎么做。

    解释结果:

    
    id | select_type        | table | type | possible_keys                                             | key                                         | key_len | ref                                | rows   | Extra
    1  | PRIMARY            | t1    | ref  | unique_index,created_reason                               | created_reason                              | 2       | const                              | 590030 | Using where; Using temporary; Using filesort
    2  | DEPENDENT SUBQUERY | t2    | ref  | unique_index,property_ads_history_property_ad_id_foreign  | property_ads_history_property_ad_id_foreign | 4       | reis_real_estate.t1.property_ad_id | 7      | Using where; Using filesort

1 个答案:

答案 0 :(得分:0)

使用连接而不是使用子查询,它将减少查询时间。 以下查询可能对您有帮助。

select  t1.*, t2.price
    from  `property_ads_history` AS `t1`
    INNER JOIN `property_ads_history` As `t2`
    where  `t1`.`created_at` >= '2016-04-13 00:00:00'
      AND  `t1`.`created_reason` = 'Scraped'
      AND  (`t1`.`price` > -1 OR  `t1`.`price` <> 999999.99 )
      AND  `t2`.`created_at` < `t1`.`created_at`
      AND  (`t2`.`price` > -1 OR  `t2`.`price` <> 999999.99 )
    ORDER BY  DATE(`t2`.`created_at`) DESC
    GROUP BY  `t1`.`property_ad_id`;

此外,在列上放置索引也会缩短查询时间。