I'm trying to improve my query so that it doesn't take so long. Is there anything I can try?
I'm using InnoDB.
My table:
mysql> describe hunted_place_review_external_urls;
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| worker_id | varchar(255) | YES | MUL | NULL | |
| queued_at | bigint(20) | YES | MUL | NULL | |
| external_url | varchar(255) | NO | | NULL | |
| place_id | varchar(63) | NO | MUL | NULL | |
| source_id | varchar(63) | NO | | NULL | |
| successful | tinyint(1) | NO | | 0 | |
+--------------+--------------+------+-----+---------+----------------+
mysql> show index from hunted_place_review_external_urls;
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| hunted_place_review_external_urls | 0 | PRIMARY | 1 | id | A | 5118685 | NULL | NULL | | BTREE | | |
| hunted_place_review_external_urls | 1 | worker_id | 1 | worker_id | A | 5118685 | NULL | NULL | YES | BTREE | | |
| hunted_place_review_external_urls | 1 | queued_at | 1 | queued_at | A | 5118685 | NULL | NULL | YES | BTREE | | |
| hunted_place_review_external_urls | 1 | worker_id_and_queued_at | 1 | worker_id | A | 5118685 | NULL | NULL | YES | BTREE | | |
| hunted_place_review_external_urls | 1 | worker_id_and_queued_at | 2 | queued_at | A | 5118685 | NULL | NULL | YES | BTREE | | |
| hunted_place_review_external_urls | 1 | place_id_source_id_external_url_successful | 1 | place_id | A | 5118685 | NULL | NULL | | BTREE | | |
| hunted_place_review_external_urls | 1 | place_id_source_id_external_url_successful | 2 | source_id | A | 5118685 | NULL | NULL | | BTREE | | |
| hunted_place_review_external_urls | 1 | place_id_source_id_external_url_successful | 3 | external_url | A | 5118685 | NULL | NULL | | BTREE | | |
| hunted_place_review_external_urls | 1 | place_id_source_id_external_url_successful | 4 | successful | A | 5118685 | NULL | NULL | | BTREE | | |
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
My queries:
mysql> select count(*) from hunted_place_review_external_urls;
+----------+
| count(*) |
+----------+
| 4217356 |
+----------+
1 row in set (0.96 sec)
mysql> select count(*) from hunted_place_review_external_urls where worker_id is null;
+----------+
| count(*) |
+----------+
| 772626 |
+----------+
1 row in set (0.27 sec)
mysql> update hunted_place_review_external_urls set worker_id = "123" where worker_id is null order by queued_at asc limit 1;
Query OK, 1 row affected (4.80 sec)
Rows matched: 1 Changed: 1 Warnings: 0
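Why does the update query take over 4 seconds even though I have both single-column and composite indexes on queued_at and worker_id? This never happened when far fewer rows had worker_id IS NULL: with about 200k such rows instead of about 780k, it took only a few milliseconds.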
Note that the equivalent query using SELECT instead of UPDATE is very fast:
mysql> select * from hunted_place_review_external_urls where worker_id is null order by queued_at asc limit 1;
1 row in set (0.00 sec)
My queued_at values are timestamps expressed in milliseconds, e.g. 1398210069531.
I tried dropping my single-column indexes on worker_id and queued_at, but the problem persists:
mysql> drop index queued_at on hunted_place_review_external_urls;
Query OK, 0 rows affected (3.75 sec)
mysql> drop index worker_id on hunted_place_review_external_urls;
Query OK, 0 rows affected (3.75 sec)
mysql> show index from hunted_place_review_external_urls;
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| hunted_place_review_external_urls | 0 | PRIMARY | 1 | id | A | 5118685 | NULL | NULL | | BTREE | | |
| hunted_place_review_external_urls | 1 | worker_id_and_queued_at | 1 | worker_id | A | 5118685 | NULL | NULL | YES | BTREE | | |
| hunted_place_review_external_urls | 1 | worker_id_and_queued_at | 2 | queued_at | A | 5118685 | NULL | NULL | YES | BTREE | | |
| hunted_place_review_external_urls | 1 | place_id_source_id_external_url_successful | 1 | place_id | A | 5118685 | NULL | NULL | | BTREE | | |
| hunted_place_review_external_urls | 1 | place_id_source_id_external_url_successful | 2 | source_id | A | 5118685 | NULL | NULL | | BTREE | | |
| hunted_place_review_external_urls | 1 | place_id_source_id_external_url_successful | 3 | external_url | A | 5118685 | NULL | NULL | | BTREE | | |
| hunted_place_review_external_urls | 1 | place_id_source_id_external_url_successful | 4 | successful | A | 5118685 | NULL | NULL | | BTREE | | |
+-----------------------------------+------------+--------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Here is my EXPLAIN SELECT output. I'm using an older version of MySQL that doesn't support EXPLAIN UPDATE:
mysql> explain select * from hunted_place_review_external_urls where worker_id is null order by queued_at asc limit 1;
+----+-------------+-----------------------------------+------+-------------------------+-------------------------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------+------+-------------------------+-------------------------+---------+-------+---------+-------------+
| 1 | SIMPLE | hunted_place_review_external_urls | ref | worker_id_and_queued_at | worker_id_and_queued_at | 768 | const | 1587282 | Using where |
+----+-------------+-----------------------------------+------+-------------------------+-------------------------+---------+-------+---------+-------------+
1 row in set (0.00 sec)
Answer 0 (score: 0)
This is your query:
update hunted_place_review_external_urls
set worker_id = "123"
where worker_id is null
order by queued_at asc
limit 1;
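It first has to find the row to update, which means applying the where clause and the order by clause. It can either do all the work itself (scan the table, then sort) or use an index. The right index is hunted_place_review_external_urls(worker_id, queued_at). You can add more columns at the end, but these must be the first two columns, in this order.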
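As a minimal sketch of the index described above: the name idx_worker_queued is hypothetical, and the table in the question already has an equivalent index called worker_id_and_queued_at, so this is only needed where such an index is missing.

```sql
-- Hypothetical index name. Column order matters: the column filtered
-- by the WHERE clause (worker_id) comes first, followed by the
-- ORDER BY column (queued_at).
ALTER TABLE hunted_place_review_external_urls
  ADD INDEX idx_worker_queued (worker_id, queued_at);

-- Check that the optimizer picks it for the row lookup.
EXPLAIN SELECT id
FROM hunted_place_review_external_urls
WHERE worker_id IS NULL
ORDER BY queued_at ASC
LIMIT 1;
```

With this column order, the rows where worker_id IS NULL are stored in the index already sorted by queued_at, so the first matching entry is also the oldest one.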
EDIT:
Given that the select is fast, try this version:
update hunted_place_review_external_urls toupdate join
(select id
from hunted_place_review_external_urls
where worker_id is null
order by queued_at asc
limit 1
) l
on toupdate.id = l.id
set toupdate.worker_id = '123';
I'm not sure why the index isn't being used properly in the update, but hopefully this will work.
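Another way to separate finding the row from updating it, sketched here as an untested alternative, is to run the fast select inside a transaction and then update by primary key. The user variable @next_id is a hypothetical name, and the sketch assumes the default InnoDB locking behaviour of FOR UPDATE.

```sql
-- Reset the variable so a stale id from an earlier run is never reused.
SET @next_id = NULL;

START TRANSACTION;

-- Find the oldest unclaimed row and lock it so that no other worker
-- can claim the same row concurrently.
SELECT id INTO @next_id
FROM hunted_place_review_external_urls
WHERE worker_id IS NULL
ORDER BY queued_at ASC
LIMIT 1
FOR UPDATE;

-- Update by primary key. If no row was found, @next_id is still NULL
-- and the statement matches nothing.
UPDATE hunted_place_review_external_urls
SET worker_id = '123'
WHERE id = @next_id;

COMMIT;
```

If the update reports zero affected rows, there was nothing left to claim.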
Answer 1 (score: 0)
Compare the times for Drop Index, Create Index and the update.
You may notice a correlation.
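When you run a simple SELECT query, indexes are USEFUL and speed things up.
When you run an UPDATE or DELETE statement, indexes are a cost and slow things down! Whenever you change the value of an indexed column, MySQL needs to rebuild the index for any subsequent rows. (Assuming you always fetch the oldest entry, that means reindexing all of the remaining 772625 rows.)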
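Try dropping the index on worker_id and see how the update performs. If worker_id is not indexed, the update should be faster. (Finding the entry to update is still about as fast as before, because it mostly depends on the sort over the indexed column queued_at, and only a small share of rows with a non-indexed null value in worker_id have to be checked against the required queued_at value.)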
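I just created a dummy database and tested your setup:
With 1,000,000 rows and both a single-column index on worker_id and a composite index on (worker_id, queued_at), the select looks like this: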
SELECT * FROM `tbl` WHERE ISNULL( worker_ID ) ORDER BY queued_at ASC LIMIT 1
Performance:
Query took 0.3360 sec
Modifying worker_id the way you do gives the following result:
UPDATE `tbl` SET worker_id=1 WHERE ISNULL(worker_ID) ORDER BY queued_at ASC LIMIT 1
Performance:
1 row affected. (Query took 7.9592 sec)
After dropping the indexes on worker_id (both the single-column and the composite one), the same query gives:
1 row affected. (Query took 1.4364 sec)
(Each insert generated 50,000 rows that therefore share the same date, so the index is not "perfect" for searching; "real" data may perform better.)