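Why does my UPDATE ... WHERE ... ORDER BY ... LIMIT 1 statement take so long?

Asked: 2014-06-02 19:49:17

Tags: mysql sql database optimization query-optimization

I'm trying to improve my query so that it doesn't take so long. Is there anything I can try?

I'm using InnoDB.

My table: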

mysql> describe hunted_place_review_external_urls;
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| worker_id    | varchar(255) | YES  | MUL | NULL    |                |
| queued_at    | bigint(20)   | YES  | MUL | NULL    |                |
| external_url | varchar(255) | NO   |     | NULL    |                |
| place_id     | varchar(63)  | NO   | MUL | NULL    |                |
| source_id    | varchar(63)  | NO   |     | NULL    |                |
| successful   | tinyint(1)   | NO   |     | 0       |                |
+--------------+--------------+------+-----+---------+----------------+

mysql> show index from hunted_place_review_external_urls;
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                             | Non_unique | Key_name                                   | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| hunted_place_review_external_urls |          0 | PRIMARY                                    |            1 | id           | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | worker_id                                  |            1 | worker_id    | A         |     5118685 |     NULL | NULL   | YES  | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | queued_at                                  |            1 | queued_at    | A         |     5118685 |     NULL | NULL   | YES  | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | worker_id_and_queued_at                    |            1 | worker_id    | A         |     5118685 |     NULL | NULL   | YES  | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | worker_id_and_queued_at                    |            2 | queued_at    | A         |     5118685 |     NULL | NULL   | YES  | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | place_id_source_id_external_url_successful |            1 | place_id     | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | place_id_source_id_external_url_successful |            2 | source_id    | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | place_id_source_id_external_url_successful |            3 | external_url | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | place_id_source_id_external_url_successful |            4 | successful   | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

My queries:

mysql> select count(*) from hunted_place_review_external_urls;
+----------+
| count(*) |
+----------+
|  4217356 |
+----------+
1 row in set (0.96 sec)

mysql> select count(*) from hunted_place_review_external_urls where worker_id is null;
+----------+
| count(*) |
+----------+
|   772626 |
+----------+
1 row in set (0.27 sec)

mysql> update hunted_place_review_external_urls set worker_id = "123" where worker_id is null order by queued_at asc limit 1;
Query OK, 1 row affected (4.80 sec)
Rows matched: 1  Changed: 1  Warnings: 0

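Why does the update query take 4 seconds even though I have both single-column and composite indexes on queued_at and worker_id? This never happened when the number of rows with worker_id IS NULL was much lower. With ~200K such rows instead of ~780K, it took only a few milliseconds.

Note that the equivalent query using SELECT instead of UPDATE is very fast: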

mysql> select * from hunted_place_review_external_urls where worker_id is null order by  queued_at asc limit 1;
1 row in set (0.00 sec)

My queued_at values are timestamps expressed in milliseconds, e.g. 1398210069531.

I tried dropping my single-column indexes on worker_id and queued_at, but the problem persists:

mysql> drop index queued_at on hunted_place_review_external_urls;
Query OK, 0 rows affected (3.75 sec)

mysql> drop index worker_id on hunted_place_review_external_urls;
Query OK, 0 rows affected (3.75 sec)

mysql> show index from hunted_place_review_external_urls;
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                             | Non_unique | Key_name                                   | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| hunted_place_review_external_urls |          0 | PRIMARY                                    |            1 | id           | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | worker_id_and_queued_at                    |            1 | worker_id    | A         |     5118685 |     NULL | NULL   | YES  | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | worker_id_and_queued_at                    |            2 | queued_at    | A         |     5118685 |     NULL | NULL   | YES  | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | place_id_source_id_external_url_successful |            1 | place_id     | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | place_id_source_id_external_url_successful |            2 | source_id    | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | place_id_source_id_external_url_successful |            3 | external_url | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
| hunted_place_review_external_urls |          1 | place_id_source_id_external_url_successful |            4 | successful   | A         |     5118685 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Here is my EXPLAIN SELECT output. I'm using an older version of MySQL that doesn't support EXPLAIN UPDATE:

mysql> explain select * from hunted_place_review_external_urls where worker_id is null order by queued_at asc limit 1;
+----+-------------+-----------------------------------+------+-------------------------+-------------------------+---------+-------+---------+-------------+
| id | select_type | table                             | type | possible_keys           | key                     | key_len | ref   | rows    | Extra       |
+----+-------------+-----------------------------------+------+-------------------------+-------------------------+---------+-------+---------+-------------+
|  1 | SIMPLE      | hunted_place_review_external_urls | ref  | worker_id_and_queued_at | worker_id_and_queued_at | 768     | const | 1587282 | Using where |
+----+-------------+-----------------------------------+------+-------------------------+-------------------------+---------+-------+---------+-------------+
1 row in set (0.00 sec)

2 answers:

Answer 0 (score: 0)

This is your query:

update hunted_place_review_external_urls
    set worker_id = "123"
    where worker_id is null
    order by queued_at asc
    limit 1;

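It first has to find the row to update, which requires applying both the where clause and the order by clause. It can either do all of that work itself (scan the table, then sort) or use an index. The right index is hunted_place_review_external_urls(worker_id, queued_at). You can add more columns at the end, but these two must be the first two columns, in this order.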
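
As a rough sketch of what that index looks like in DDL, and how to check that the row lookup uses it: the index name below is only illustrative (the table in the question already has one with this definition, called worker_id_and_queued_at), so the CREATE INDEX is needed only when no such index exists yet.

```sql
-- List existing indexes first; look for one whose first two columns
-- are worker_id, then queued_at (Seq_in_index 1 and 2).
SHOW INDEX FROM hunted_place_review_external_urls;

-- Only if no such index exists. The name is an illustrative choice;
-- this statement fails with a duplicate key name error if an index
-- called worker_id_and_queued_at is already present.
CREATE INDEX worker_id_and_queued_at
    ON hunted_place_review_external_urls (worker_id, queued_at);

-- Check the plan of the row lookup. With the composite index in use,
-- the "key" column should show it and "Extra" should not list
-- "Using filesort".
EXPLAIN
SELECT id
FROM hunted_place_review_external_urls
WHERE worker_id IS NULL
ORDER BY queued_at ASC
LIMIT 1;
```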

EDIT:

Given that the select is fast, try the following version:

update hunted_place_review_external_urls toupdate join
       (select id
        from hunted_place_review_external_urls
        where worker_id is null
        order by queued_at asc
        limit 1
       ) l
       on toupdate.id = l.id
    set toupdate.worker_id = '123';

I'm not sure why the index isn't being used properly in the update, but hopefully this will work.
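
If the join form does not help either, one more thing that might be worth trying is to split the work into two statements inside a transaction: a locking SELECT that finds the row, followed by an UPDATE by primary key. This is only a sketch and not something tested against the table in the question; @claimed_id is a hypothetical user variable introduced here for illustration.

```sql
-- Reset the variable so a previous value cannot leak through
-- when the SELECT below matches no row.
SET @claimed_id = NULL;

START TRANSACTION;

-- Find the oldest unclaimed row and lock it, so that a concurrent
-- worker running the same statements cannot claim it as well.
SELECT id INTO @claimed_id
FROM hunted_place_review_external_urls
WHERE worker_id IS NULL
ORDER BY queued_at ASC
LIMIT 1
FOR UPDATE;

-- Update by primary key only; matches nothing if @claimed_id is NULL.
UPDATE hunted_place_review_external_urls
SET worker_id = '123'
WHERE id = @claimed_id;

COMMIT;
```

The idea is that the first statement is the SELECT that is already known to be fast, and the UPDATE then touches exactly one row through the primary key.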

Answer 1 (score: 0)

Compare the timings of DROP INDEX, CREATE INDEX and UPDATE. You may notice a correlation.

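  • When you run a simple SELECT query, indexes are USEFUL and speed things up.

  • When you run UPDATE or DELETE statements, indexes are bad and slow things down! Whenever you change the value of an indexed column, MySQL needs to rebuild the index for any following rows. (Assuming you always fetch the oldest entry, this means re-indexing all of the remaining 772625 rows.)

Try dropping the indexes on worker_id and see how the update performs. If worker_id is not indexed, the update will be faster. (Finding the entry to update should still be about as fast as before, because it mainly depends on the sort over the indexed column queued_at, plus a small amount of work matching the non-indexed NULL values in worker_id against the required queued_at order.)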
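
A minimal sketch of that experiment, best run on a copy of the table since every UPDATE below really claims a row. It assumes the index state shown at the end of the question (only worker_id_and_queued_at left on worker_id, and the single queued_at index already dropped), so the queued_at index is recreated here under its old name.

```sql
-- Baseline: time the update with the composite index in place.
-- The mysql client prints the elapsed time after each statement.
UPDATE hunted_place_review_external_urls
SET worker_id = '123'
WHERE worker_id IS NULL
ORDER BY queued_at ASC
LIMIT 1;

-- Remove the remaining index that contains worker_id and put back
-- a plain index on queued_at so the ORDER BY can still use one.
ALTER TABLE hunted_place_review_external_urls
    DROP INDEX worker_id_and_queued_at,
    ADD INDEX queued_at (queued_at);

-- Same update again; compare the reported time with the baseline.
UPDATE hunted_place_review_external_urls
SET worker_id = '123'
WHERE worker_id IS NULL
ORDER BY queued_at ASC
LIMIT 1;
```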


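I just created a dummy database and tested your setup. With 1,000,000 rows and both indexes in place (a single index on worker_id and a composite index on (worker_id, queued_at)), the SELECT looks like this: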

SELECT * FROM `tbl` WHERE ISNULL( worker_ID ) ORDER BY queued_at ASC LIMIT 1 

and the performance:

Query took 0.3360 sec

Trying to modify worker_id the way you do gives the following:

UPDATE `tbl` SET worker_id=1 WHERE ISNULL(worker_ID) ORDER BY queued_at ASC LIMIT 1

Performance:

1 row affected. (Query took 7.9592 sec)

After dropping the indexes on worker_id (single and composite), the same query gives:

1 row affected. (Query took 1.4364 sec) 

(Each insert generated 50,000 rows, so they all share the same date. The index is therefore not "perfect" for searching, and "real" data may perform better.)
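
For anyone who wants to repeat the test, here is one possible way to build a comparable dummy table. The exact schema and data generation used above are not shown, so everything in this sketch (column types, index names, the row-doubling trick) is an assumption rather than the original setup.

```sql
-- Hypothetical stand-in for the test table; only the columns that
-- matter for the query are included.
CREATE TABLE tbl (
    id        BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    worker_id VARCHAR(255) NULL,
    queued_at BIGINT NULL,
    KEY worker_id (worker_id),
    KEY worker_id_and_queued_at (worker_id, queued_at)
) ENGINE=InnoDB;

-- Seed a single unclaimed row with a millisecond timestamp.
INSERT INTO tbl (worker_id, queued_at)
VALUES (NULL, UNIX_TIMESTAMP() * 1000);

-- Each run of this statement doubles the row count; repeating it
-- 20 times yields 1,048,576 rows. Rows created by the same run share
-- one queued_at value, much like the batches described above.
INSERT INTO tbl (worker_id, queued_at)
SELECT NULL, UNIX_TIMESTAMP() * 1000 FROM tbl;
```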