FIND_IN_SET or JOIN?

Time: 2019-05-13 03:11:32

Tags: mysql join find-in-set

I have an article table, and each article can be associated with many categories.

Here is solution 1, which uses FIND_IN_SET:

 CREATE TABLE `article` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `cat` varchar(512) DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5090104 DEFAULT CHARSET=utf8mb4;

Here is solution 2, which uses a join table:

 CREATE TABLE `article_2` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5511464 DEFAULT CHARSET=utf8mb4;


 CREATE TABLE `article_cat` (
  `cat` int(10) unsigned NOT NULL,
  `article_id` int(10) unsigned NOT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`cat`,`article_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

And here are some query comparisons:

mysql> select count(1) from article;
+----------+
| count(1) |
+----------+
|  5207355 |
+----------+
1 row in set (1.12 sec)

mysql> select count(1) from article_2;
+----------+
| count(1) |
+----------+
|  5207313 |
+----------+
1 row in set (0.74 sec)

mysql> select count(1) from article_cat;
+----------+
| count(1) |
+----------+
| 20589375 |
+----------+
1 row in set (2.85 sec)


mysql> select count(1) from `article` where FIND_IN_SET('2',`article`.`cat`);
+----------+
| count(1) |
+----------+
|  2157053 |
+----------+
1 row in set (1.22 sec)

mysql> select count(1) from `article_2` u INNER JOIN article_cat c ON c.article_id = u.id WHERE c.cat =2;
+----------+
| count(1) |
+----------+
|  2156622 |
+----------+
1 row in set (13.72 sec)


mysql> select * from `article` where FIND_IN_SET('2',`article`.`cat`) limit 2000000,10;
+---------+-----------------+---------------------+---------------------+
| id      | cat             | created_at          | updated_at          |
+---------+-----------------+---------------------+---------------------+
| 5132154 | 2,5,7           | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132155 | 1,2,3,5,6,8,9   | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132156 | 1,2,6,7         | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132157 | 1,2,3,5,9       | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132158 | 1,2,3,4,6,7,9   | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132159 | 1,2,5,6,7       | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132160 | 2,3,4,5,7,9     | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132161 | 1,2,5,9         | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132164 | 1,2,3,4,6,7,8,9 | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
| 5132166 | 1,2,3,4,5,8     | 2019-05-13 15:24:58 | 2019-05-13 15:24:58 |
+---------+-----------------+---------------------+---------------------+
10 rows in set (1.28 sec)

mysql> select * from `article_2` u INNER JOIN article_cat c ON c.article_id = u.id WHERE c.cat =2 limit 2000000,10 ;
+---------+---------------------+---------------------+-----+------------+---------------------+---------------------+
| id      | created_at          | updated_at          | cat | article_id | created_at          | updated_at          |
+---------+---------------------+---------------------+-----+------------+---------------------+---------------------+
| 5133109 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133109 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133110 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133110 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133113 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133113 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133116 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133116 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133117 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133117 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133120 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133120 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133124 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133124 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133133 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133133 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133137 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133137 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
| 5133138 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |   2 |    5133138 | 2019-05-13 15:25:01 | 2019-05-13 15:25:01 |
+---------+---------------------+---------------------+-----+------------+---------------------+---------------------+
10 rows in set (14.01 sec)

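As far as I know, FIND_IN_SET does a full scan of the column and gets no benefit from an index, yet here it performs better than the join-table solution. Is this normal?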

Is it a good idea to use FIND_IN_SET once the table grows to 10,000,000+ rows? If not, when is FIND_IN_SET the better choice?

Update:

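FIND_IN_SET does perform better in some cases. The only objection to it is about relational database design style, which is not what this question is asking.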

1 answer:

Answer 0: (score: 0)

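The join is slow because you are looking up a lot of rows in the index, and then looking each of them up again to fetch the timestamps and other fields (which are not in the index).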
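As a rough sketch of how to avoid those repeated lookups with the schema from the question: the count can be answered from article_cat alone, and the paginated query can pick its 10 ids first and only then join to article_2. Both statements assume that every article_cat.article_id refers to an existing article_2 row; the ORDER BY is added here so that the page is deterministic, and neither query has been timed on the asker's data.

```sql
-- Count the articles in category 2 using only the (cat, article_id) primary key.
-- Assumes every article_cat row points at an existing article_2 row.
SELECT COUNT(*)
FROM article_cat
WHERE cat = 2;

-- Deferred join: skip to the wanted page inside article_cat first,
-- then fetch the full article rows for just those 10 ids.
SELECT u.*
FROM (
    SELECT article_id
    FROM article_cat
    WHERE cat = 2
    ORDER BY article_id
    LIMIT 2000000, 10
) AS c
INNER JOIN article_2 AS u ON u.id = c.article_id
ORDER BY u.id;
```

Running EXPLAIN on these and on the original queries should show whether the optimizer actually drops the per-row lookups into article_2.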

Suggestions for article_cat:

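  1. It does not need an id column.
  2. (cat, article_id) is a good primary key, especially for this query. If you query by article_id more often, swap the order and add (cat, article_id) as a UNIQUE KEY.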

Remove created_at and updated_at: they are already in the article table, so here they are duplicates and they slow down the JOIN query.
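Put together, the suggestions above might look like the following sketch. It shows the variant for workloads that query mostly by article_id; the table name article_cat_v2 and the key name cat_article are made up for illustration and do not come from the question. If lookups by category dominate, as in the queries shown in the question, keeping PRIMARY KEY (cat, article_id) as it is would be the simpler choice.

```sql
-- Hypothetical revision of article_cat; table and key names are illustrative.
CREATE TABLE `article_cat_v2` (
  `article_id` int(10) unsigned NOT NULL,
  `cat` int(10) unsigned NOT NULL,
  -- "Which categories does article N have?" is served by the primary key.
  PRIMARY KEY (`article_id`, `cat`),
  -- "Which articles are in category 2?" is served by this key.
  UNIQUE KEY `cat_article` (`cat`, `article_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```

With no timestamp columns left, both keys cover every column in the table, so a query that filters on either column can be answered from the index alone.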