In MySQL, is it faster to execute one JOIN plus one LIKE, or two JOINs?

Time: 2017-04-24 14:15:11

Tags: mysql performance join sql-like

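I have to create a cron job. That is simple in itself, but because it will run every minute I am worried about performance. I have two tables: one with user names and one with details about their networks. Most of the time a user belongs to just one network. In theory a user could belong to more, but even then probably only two or three. So, to reduce the number of JOINs, I saved the network IDs in a field of the users table, separated by |, for example: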

|1|3|9|

The users table structure (simplified for this question) is

CREATE TABLE `users` (
  `u_id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE,
  `userid` VARCHAR(500) NOT NULL UNIQUE,
  `net_ids` VARCHAR(500) NOT NULL DEFAULT '',
  PRIMARY KEY (`u_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The (simplified) network table structure is

CREATE TABLE `network` (
  `n_id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE,
  `netname` VARCHAR(500) NOT NULL UNIQUE,
  `login_time` DATETIME DEFAULT NULL,
  `timeout_mins` TINYINT UNSIGNED NOT NULL DEFAULT 10,
  PRIMARY KEY (`n_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I have to send out a warning when a timeout occurs, and my query is

SELECT N.netname, N.timeout_mins, N.n_id, U.userid FROM
(SELECT netname, timeout_mins, n_id FROM network
 WHERE is_open = 1 AND notify = 1
 AND TIMESTAMPDIFF(SECOND, TIMESTAMPADD(MINUTE, timeout_mins, login_time), NOW()) < 60) AS N
INNER JOIN users AS U ON U.net_ids LIKE CONCAT('%|', N.n_id, '|%');

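I made N a subquery to reduce the number of rows being joined. But I wonder whether it would be faster to add a third table with u_id and n_id as columns, remove the net_ids column from users, and then join across all three tables. I ask because I have read that using LIKE slows things down.

Which query is the most efficient in this case: one JOIN and a LIKE, or two JOINs?

P.S. I did some experiments, and the initial timings for two JOINs were higher than for one JOIN and a LIKE. However, running the same queries repeatedly seemed to speed things up a lot. I suspect something is being cached somewhere, either in my application or in the database, and the two became comparable, so I did not find this data satisfactory. It also contradicts what I expected based on what I have been reading.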

I used this table:

CREATE TABLE `user_net` (
`u_id` BIGINT UNSIGNED NOT NULL,
`n_id` BIGINT UNSIGNED NOT NULL,
INDEX `u_id` (`u_id`),
FOREIGN KEY (`u_id`) REFERENCES `users`(`u_id`),
INDEX `n_id` (`n_id`),
FOREIGN KEY (`n_id`) REFERENCES `network`(`n_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

and this query:

SELECT N.netname, N.timeout_mins, N.n_id, U.userid FROM
(SELECT netname, timeout_mins, n_id FROM network
 WHERE is_open = 1 AND notify = 1
 AND TIMESTAMPDIFF(SECOND, TIMESTAMPADD(MINUTE, timeout_mins, login_time), NOW()) < 60) AS N
INNER JOIN user_net AS UN ON N.n_id = UN.n_id
INNER JOIN users AS U ON UN.u_id = U.u_id;

2 answers:

Answer 0: (score: 1)

You should define composite indexes for the user_net table. One of them can (and should) be the primary key.

CREATE TABLE `user_net` (
    `u_id` BIGINT UNSIGNED NOT NULL,
    `n_id` BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (`u_id`, `n_id`),
    INDEX `uid_nid` (`n_id`, `u_id`),
    FOREIGN KEY (`u_id`) REFERENCES `users`(`u_id`),
    FOREIGN KEY (`n_id`) REFERENCES `network`(`n_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I would also rewrite your query to:

SELECT N.netname, N.timeout_mins, N.n_id, U.userid
FROM network N
INNER JOIN user_net AS UN ON N.n_id = UN.n_id
INNER JOIN users AS U  ON UN.u_id   = U.u_id
WHERE N.is_open = 1 
  AND N.notify = 1
  AND TIMESTAMPDIFF(SECOND, TIMESTAMPADD(MINUTE, N.timeout_mins, N.login_time), NOW()) < 60

While your subquery will probably not hurt much, there is no need for it.
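Rather than relying on timings, which caching can distort, one way to check whether the subquery matters is to compare the execution plans. A minimal sketch, assuming the `user_net` table above exists and that `network` has the `is_open` and `notify` columns used in the question:

```sql
-- Plan for the version with the derived table (subquery).
EXPLAIN
SELECT N.netname, N.timeout_mins, N.n_id, U.userid
FROM (SELECT netname, timeout_mins, n_id FROM network
      WHERE is_open = 1 AND notify = 1
        AND TIMESTAMPDIFF(SECOND, TIMESTAMPADD(MINUTE, timeout_mins, login_time), NOW()) < 60) AS N
INNER JOIN user_net AS UN ON N.n_id = UN.n_id
INNER JOIN users AS U ON UN.u_id = U.u_id;

-- Plan for the flat version without the subquery.
EXPLAIN
SELECT N.netname, N.timeout_mins, N.n_id, U.userid
FROM network N
INNER JOIN user_net AS UN ON N.n_id = UN.n_id
INNER JOIN users AS U ON UN.u_id = U.u_id
WHERE N.is_open = 1
  AND N.notify = 1
  AND TIMESTAMPDIFF(SECOND, TIMESTAMPADD(MINUTE, N.timeout_mins, N.login_time), NOW()) < 60;
```

In the output, the `type`, `key` and `rows` columns show which access method and index each table uses and roughly how many rows are examined.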

Note that the last condition cannot use an index, because it combines two columns inside a function call. If your MySQL version is at least 5.7.6 you can define a virtual (calculated) column, and InnoDB can index virtual columns as of 5.7.8.

CREATE TABLE `network` (
  `n_id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE,
  `netname` VARCHAR(500) NOT NULL UNIQUE,
  `login_time` DATETIME DEFAULT NULL,
  `timeout_mins` TINYINT UNSIGNED NOT NULL DEFAULT 10,
  `is_open` TINYINT UNSIGNED,
  `notify`  TINYINT UNSIGNED,
  `timeout_dt` DATETIME AS (`login_time` + INTERVAL `timeout_mins` MINUTE),
  PRIMARY KEY (`n_id`),
  INDEX (`timeout_dt`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Now change the query to:

SELECT N.netname, N.timeout_mins, N.n_id, U.userid
FROM network N
INNER JOIN user_net AS UN ON N.n_id = UN.n_id
INNER JOIN users AS U  ON UN.u_id   = U.u_id
WHERE N.is_open = 1 
  AND N.notify  = 1
  AND N.timeout_dt > NOW() - INTERVAL 60 SECOND  -- same as TIMESTAMPDIFF(SECOND, timeout_dt, NOW()) < 60

and it will be able to use the index.
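If the `network` table already exists, the column and index can be added with ALTER TABLE instead of recreating the table. This is only a sketch, assuming InnoDB on a MySQL version that supports indexed virtual columns; whether the optimizer actually picks the index depends on your data:

```sql
-- Add the virtual generated column; its value is derived from
-- login_time and timeout_mins and is not stored in the row itself.
ALTER TABLE `network`
  ADD COLUMN `timeout_dt` DATETIME
    AS (`login_time` + INTERVAL `timeout_mins` MINUTE) VIRTUAL;

-- Index the generated column (unnamed, so MySQL names the index `timeout_dt`).
ALTER TABLE `network`
  ADD INDEX (`timeout_dt`);

-- Check the plan: the `key` column shows which index, if any, is used.
-- The range below matches the question's TIMESTAMPDIFF(...) < 60 check.
EXPLAIN
SELECT n_id, netname
FROM `network`
WHERE timeout_dt > NOW() - INTERVAL 60 SECOND;
```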

You can also try to replace

INDEX (`timeout_dt`)

with

INDEX (`is_open`, `notify`, `timeout_dt`)

and see if it is of any help.
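Swapping the indexes might look like the sketch below. It assumes the single-column index was created unnamed, as in the CREATE TABLE above, in which case MySQL names it after its first column (`timeout_dt`). The new index name `open_notify_timeout` is made up for illustration.

```sql
-- Confirm the current index names first.
SHOW INDEX FROM `network`;

-- Replace the single-column index with the composite one in one statement.
ALTER TABLE `network`
  DROP INDEX `timeout_dt`,
  ADD INDEX `open_notify_timeout` (`is_open`, `notify`, `timeout_dt`);
```

With the two equality columns first, the range condition on `timeout_dt` can use the last part of the index.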

Answer 1: (score: 0)

Reformulate the condition to avoid hiding columns inside functions. I cannot make sense of your date expression, but note:

login_time < NOW() - INTERVAL timeout_mins MINUTE

If you can arrive at something like that, then this index should help:

INDEX(is_open, notify, login_time)

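If that is not good enough, let's see the other formulation so that we can compare them.

Separating things with commas (or |) is probably a really bad idea.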
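As a sketch of moving away from the delimited column, the existing `net_ids` values could be copied into a junction table once, using the same LIKE match as the question. This assumes `net_ids` is stored as `|1|3|9|` with a leading and trailing `|`, and that an empty `user_net` table like the one in the question already exists:

```sql
-- One-time backfill: one row per (user, network) pair found in net_ids.
-- IDs in net_ids that have no row in network are silently skipped.
INSERT INTO user_net (u_id, n_id)
SELECT U.u_id, N.n_id
FROM users AS U
INNER JOIN network AS N
  ON U.net_ids LIKE CONCAT('%|', N.n_id, '|%');

-- Spot-check: compare the old delimited value with the migrated rows
-- before dropping the net_ids column.
SELECT U.userid,
       U.net_ids,
       GROUP_CONCAT(UN.n_id ORDER BY UN.n_id SEPARATOR '|') AS migrated
FROM users AS U
LEFT JOIN user_net AS UN ON UN.u_id = U.u_id
GROUP BY U.u_id, U.userid, U.net_ids;
```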

Bottom line: assume that JOINs are not a performance problem, and write the query with as many JOINs as it needs. Then let's optimize that.