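I wanted to simulate a large amount of data in my database and test how my queries perform under those conditions. Not surprisingly, the queries turned out to be slow, so I'm here looking for advice on how to index my tables better and improve the queries.
Before I post the SQL for the tables and the queries I use, let me explain what is what. I have a user table populated with 100,000 records. Most of its columns are enum types, such as hair color, looking_for, etc. The first query is generated when a search is performed. It contains a WHERE clause that filters on some or all of those column values and retrieves only the IDs, limited to 20.
Then I have 3 more tables that hold roughly 50-1000 records per user, so the numbers can really grow. These tables record who visited whose profile, who marked whom as a favorite and who blocked whom, and there is also a messaging table. My goal is to retrieve the 20 records that match the search criteria and, at the same time, determine whether I (the browsing user) have blocked that user, favorited that user, been favorited by that user, have unread messages from that user, or have exchanged any messages with that user (these are the flags the joins in the second query below compute).
For this I tried using joins and subqueries, but the problem is that the second query, which retrieves the users and the data listed above, is still slow. I think I need better indexes and possibly a better query. Here is what I have right now: first the table definitions, then the 2 queries. The first query filters and determines the IDs; the second uses the IDs from the first to retrieve the data. I hope you guys can help me create better indexes and optimize the queries.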
CREATE TABLE user (id BIGINT AUTO_INCREMENT, dname VARCHAR(255) NOT NULL, email VARCHAR(255) NOT NULL UNIQUE, email_code VARCHAR(255), email_confirmed TINYINT(1) DEFAULT '0', password VARCHAR(255) NOT NULL, gender ENUM('male', 'female'), description TEXT, dob DATE, height MEDIUMINT, looks ENUM('thin', 'average', 'athletic', 'heavy'), looking_for ENUM('marriage', 'dating', 'friends'), looking_for_age1 BIGINT, looking_for_age2 BIGINT, color_hair ENUM('black', 'brown', 'blond', 'red'), color_eyes ENUM('black', 'brown', 'blue', 'green', 'grey'), marital_status ENUM('single', 'married', 'divorced', 'widowed'), smokes ENUM('no', 'yes', 'sometimes'), drinks ENUM('no', 'yes', 'sometimes'), has_children ENUM('no', 'yes'), wants_children ENUM('no', 'yes'), education ENUM('school', 'college', 'university', 'masters', 'phd'), occupation ENUM('no', 'yes'), image_id BIGINT /* column missing from the posted DDL but referenced by image_id_idx and by the EXPLAIN output; the BIGINT type is an assumption */, country_id BIGINT, city_id BIGINT, lastlogin_at DATETIME, deleted_at DATETIME, created_at DATETIME NOT NULL, updated_at DATETIME NOT NULL, INDEX country_id_idx (country_id), INDEX city_id_idx (city_id), INDEX image_id_idx (image_id), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = INNODB;
CREATE TABLE block (id BIGINT AUTO_INCREMENT, blocker_id BIGINT, blocked_id BIGINT, created_at DATETIME NOT NULL, updated_at DATETIME NOT NULL, INDEX blocker_id_idx (blocker_id), INDEX blocked_id_idx (blocked_id), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = INNODB;
CREATE TABLE city (id BIGINT AUTO_INCREMENT, name_eng VARCHAR(30), name_geo VARCHAR(30), name_geo_shi VARCHAR(30), name_geo_is VARCHAR(30), country_id BIGINT NOT NULL, active TINYINT(1) DEFAULT '0', INDEX country_id_idx (country_id), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = INNODB;
CREATE TABLE country (id BIGINT AUTO_INCREMENT, code VARCHAR(2), name_eng VARCHAR(30), name_geo VARCHAR(30), name_geo_shi VARCHAR(30), name_geo_is VARCHAR(30), active TINYINT(1) DEFAULT '1', PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = INNODB;
CREATE TABLE favorite (id BIGINT AUTO_INCREMENT, favoriter_id BIGINT, favorited_id BIGINT, created_at DATETIME NOT NULL, updated_at DATETIME NOT NULL, INDEX favoriter_id_idx (favoriter_id), INDEX favorited_id_idx (favorited_id), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = INNODB;
CREATE TABLE message (id BIGINT AUTO_INCREMENT, body TEXT, sender_id BIGINT, receiver_id BIGINT, read_at DATETIME, created_at DATETIME NOT NULL, updated_at DATETIME NOT NULL, INDEX sender_id_idx (sender_id), INDEX receiver_id_idx (receiver_id), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = INNODB;
CREATE TABLE visitor (id BIGINT AUTO_INCREMENT, visitor_id BIGINT, visited_id BIGINT, created_at DATETIME NOT NULL, updated_at DATETIME NOT NULL, INDEX visitor_id_idx (visitor_id), INDEX visited_id_idx (visited_id), PRIMARY KEY(id)) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE = INNODB;
SELECT s.id AS s__id FROM user s WHERE (s.gender = 'female' AND s.marital_status = 'single' AND s.smokes = 'no' AND s.deleted_at IS NULL) LIMIT 20
SELECT s.id AS s__id, s.dname AS s__dname, s.gender AS s__gender, s.height AS s__height, s.dob AS s__dob, s3.id AS s3__id, s3.code AS s3__code, s3.name_geo AS s3__name_geo, s4.id AS s4__id, s4.name_geo AS s4__name_geo, s5.id AS s5__id, s6.id AS s6__id, s7.id AS s7__id, s8.id AS s8__id, s9.id AS s9__id FROM user s LEFT JOIN country s3 ON s.country_id = s3.id LEFT JOIN city s4 ON s.city_id = s4.id LEFT JOIN block s5 ON ((s.id = s5.blocked_id AND s5.blocker_id = '1')) LEFT JOIN favorite s6 ON ((s.id = s6.favorited_id AND s6.favoriter_id = '1')) LEFT JOIN favorite s7 ON ((s.id = s7.favoriter_id AND s7.favorited_id = '1')) LEFT JOIN message s8 ON ((s.id = s8.sender_id AND s8.receiver_id = '1' AND s8.read_at IS NULL)) LEFT JOIN message s9 ON (((s.id = s9.sender_id AND s9.receiver_id = '1') OR (s.id = s9.receiver_id AND s9.sender_id = '1'))) WHERE (s.id IN ('22', '36', '53', '105', '152', '156', '169', '182', '186', '192', '201', '215', '252', '287', '288', '321', '330', '351', '366', '399')) GROUP BY s.id ORDER BY s.id
Here are the EXPLAIN results for the 2 queries above.
First:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | s | ALL | NULL | NULL | NULL | NULL | 100420 | Using where
Second:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | s | range | PRIMARY | PRIMARY | 8 | NULL | 20 | Using where; Using temporary; Using filesort
1 | SIMPLE | s2 | eq_ref | PRIMARY | PRIMARY | 8 | sagule.s.image_id | 1 | Using index
1 | SIMPLE | s3 | eq_ref | PRIMARY | PRIMARY | 8 | sagule.s.country_id | 1 |
1 | SIMPLE | s4 | eq_ref | PRIMARY | PRIMARY | 8 | sagule.s.city_id | 1 |
1 | SIMPLE | s5 | ref | blocker_id_idx,blocked_id_idx | blocked_id_idx | 9 | sagule.s.id | 5 |
1 | SIMPLE | s6 | ref | favoriter_id_idx,favorited_id_idx | favorited_id_idx | 9 | sagule.s.id | 6 |
1 | SIMPLE | s7 | ref | favoriter_id_idx,favorited_id_idx | favoriter_id_idx | 9 | sagule.s.id | 6 |
1 | SIMPLE | s8 | ref | sender_id_idx,receiver_id_idx | sender_id_idx | 9 | sagule.s.id | 7 |
1 | SIMPLE | s9 | index_merge | sender_id_idx,receiver_id_idx | receiver_id_idx,sender_id_idx | 9,9 | NULL | 66 | Using union(receiver_id_idx,sender_id_idx); Using where
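Answer 0: (score: 3)
Answer 1: (score: 1)
In the second SELECT query you can remove the GROUP BY clause, because you are not using any aggregate functions (count, min, max, ...) in the SELECT clause.
I doubt this will do much for performance, though.
Anyway, I recommend watching the first half of the talk "A Look Into a MySQL DBA's Toolbox". (The first two thirds of the video cover free, open-source administration tools for MySQL on Unix; the last third is about replication.)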
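One caveat on dropping GROUP BY: the joins to block, favorite and message can match several rows per user, so in the query as posted GROUP BY s.id is also what collapses the result back to one row per user. Here is a rough sketch of one way around that, assuming the joined ids are only needed as yes/no flags. Each relationship becomes an EXISTS subquery, so nothing multiplies rows. The column aliases (i_blocked, has_unread and so on) are made up for illustration, and 1 stands for the browsing user's id as in the question.

```sql
-- Sketch only: same 20 users as in the question, one row per user,
-- with each relationship reduced to a 0/1 flag instead of a join.
SELECT s.id, s.dname, s.gender, s.height, s.dob,
       s3.id AS country_id, s3.code AS country_code, s3.name_geo AS country_name,
       s4.id AS city_id, s4.name_geo AS city_name,
       -- did I block this user?
       EXISTS (SELECT 1 FROM block b
               WHERE b.blocker_id = 1 AND b.blocked_id = s.id)     AS i_blocked,
       -- did I favorite this user?
       EXISTS (SELECT 1 FROM favorite f
               WHERE f.favoriter_id = 1 AND f.favorited_id = s.id) AS i_favorited,
       -- did this user favorite me?
       EXISTS (SELECT 1 FROM favorite f
               WHERE f.favoriter_id = s.id AND f.favorited_id = 1) AS favorited_me,
       -- unread messages from this user to me
       EXISTS (SELECT 1 FROM message m
               WHERE m.sender_id = s.id AND m.receiver_id = 1
                 AND m.read_at IS NULL)                            AS has_unread,
       -- any message in either direction (the OR join split into two probes)
       (EXISTS (SELECT 1 FROM message m
                WHERE m.sender_id = s.id AND m.receiver_id = 1)
        OR EXISTS (SELECT 1 FROM message m
                   WHERE m.sender_id = 1 AND m.receiver_id = s.id)) AS has_messages
FROM user s
LEFT JOIN country s3 ON s3.id = s.country_id
LEFT JOIN city    s4 ON s4.id = s.city_id
WHERE s.id IN (22, 36, 53, 105, 152, 156, 169, 182, 186, 192,
               201, 215, 252, 287, 288, 321, 330, 351, 366, 399)
ORDER BY s.id;
```

The country and city joins are on primary keys, so they return at most one row per user and no GROUP BY is needed. I have not run this against the poster's data, so it is worth comparing the EXPLAIN output and timings with the original query before adopting it.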
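Answer 2: (score: 1)
Without some data to test against, it is not easy to give good advice.
Creating indexes on the fields you search most often can make your queries faster. With indexes, however, your inserts and updates can become slower, so you have to weigh the trade-off. Index the frequently searched columns, but try each new index on your test data to see whether the query actually runs faster.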
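As a starting point for that kind of experiment, here is a sketch of composite indexes that match the columns the two posted queries filter on. The index names are invented, and whether each index pays off depends on the real data, so treat them as candidates to test rather than a recommendation.

```sql
-- Search query: the posted example filters on gender, marital_status,
-- smokes and deleted_at IS NULL. A composite index only helps when the
-- WHERE clause uses its leading columns, so the column order is a guess
-- at the most common search.
ALTER TABLE user
  ADD INDEX search_idx (gender, marital_status, smokes, deleted_at);

-- Relationship tables: the second query always fixes one side to the
-- browsing user and the other side to the listed user, so a two-column
-- index lets MySQL find the exact pair instead of every row for one user.
ALTER TABLE block
  ADD INDEX blocker_blocked_idx (blocker_id, blocked_id);

ALTER TABLE favorite
  ADD INDEX favoriter_favorited_idx (favoriter_id, favorited_id);

-- read_at is appended so the unread-message check can be answered
-- from the index alone.
ALTER TABLE message
  ADD INDEX receiver_sender_read_idx (receiver_id, sender_id, read_at),
  ADD INDEX sender_receiver_idx (sender_id, receiver_id);
```

Because the search may use any subset of the enum columns, a single composite index will not cover every combination. Re-running EXPLAIN after each ALTER TABLE shows whether the optimizer actually picks the new index.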
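I don't know which tool you are using, but in MySQL Workbench there is an "Explain Current Statement" command under the "Query" menu. It shows you which operations MySQL performs and which keys it uses. Your first query shows NULL in the key column, which means no key was used and MySQL has to go through all the rows and compare each one against the search terms.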
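If you are not using Workbench, the same information is available from any MySQL client by prefixing the statement with EXPLAIN. A minimal example using the first query from the question:

```sql
-- Prefix the statement with EXPLAIN to see the execution plan
-- instead of running the query.
EXPLAIN
SELECT s.id
FROM user s
WHERE s.gender = 'female'
  AND s.marital_status = 'single'
  AND s.smokes = 'no'
  AND s.deleted_at IS NULL
LIMIT 20;

-- Columns worth checking in the output:
--   type : ALL means a full table scan
--   key  : the index actually chosen (NULL means none)
--   rows : the optimizer's estimate of rows it must examine
```

In the plan posted in the question this query shows type ALL, key NULL and about 100420 rows, which is the full-scan case described above.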
Hope this helps a bit.