我有一个看起来像这样的MariaDB表:
+--------+--------+--------+---------------------+
| realm | key2 | userId | date |
+--------+--------+--------+---------------------+
| AB3 | 123 | 1 | 2017-08-04 17:30:00 |
| AB3 | 124 | 1 | 2017-08-04 17:30:00 |
| AB3 | 125 | 1 | 2017-08-04 17:30:00 |
| XY7 | 97 | 2 | 2017-08-04 17:35:00 |
| XY7 | 98 | 2 | 2017-08-04 17:35:00 |
| XY7 | 99 | 2 | 2017-08-04 17:35:00 |
| AB3 | 110 | 3 | 2017-08-04 17:40:00 |
| AB3 | 111 | 3 | 2017-08-04 17:40:00 |
+--------+--------+--------+---------------------+
PRIMARY_KEY (realm, key2)
INDEX (realm, userId)
INDEX (date)
此表用作处理用户操作的某种队列。基本上,服务器始终从该表中获取最旧的数据,对其进行处理并从该表中删除它。每个领域都有自己的服务器来处理这个队列。
现在我想找出用户在该领域的队列中的位置。因此,使用上面的示例,当我在领域'AB3'中请求userId 3的位置时,我想得到结果2
,因为只有一个其他用户(userId 1)将在之前为领域AB3处理。
(行key2
可能在此示例中无关紧要。我只包含它,因为它是主键的一部分,可能使其与找到一个好的解决方案相关)
这是SQL架构:
CREATE TABLE `queue` (
`realm` varchar(5) NOT NULL,
`key2` int(10) UNSIGNED NOT NULL,
`userId` int(10) UNSIGNED NOT NULL,
`date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `queue` (`realm`, `key2`, `userId`, `date`) VALUES
('AB3', 110, 3, '2017-08-04 17:40:00'),
('AB3', 111, 3, '2017-08-04 17:40:00'),
('AB3', 123, 1, '2017-08-04 17:30:00'),
('AB3', 124, 1, '2017-08-04 17:30:00'),
('AB3', 125, 1, '2017-08-04 17:30:00'),
('XY7', 97, 2, '2017-08-04 17:35:00'),
('XY7', 98, 2, '2017-08-04 17:35:00'),
('XY7', 99, 2, '2017-08-04 17:35:00');
ALTER TABLE `queue`
ADD PRIMARY KEY (`realm`,`key2`),
ADD KEY `ru` (`realm`,`userId`) USING BTREE,
ADD KEY `date` (`date`);
我提出了这个看起来有效的查询,但在一个包含10,000,000个条目的表格上很慢(~3秒):
SELECT (COUNT(DISTINCT `realm`, `userId`)+1) `position`
FROM `queue`
WHERE `realm` = 'AB3'
AND `date` < (
SELECT `date`
FROM `queue`
WHERE `realm` = 'AB3' AND `userId` = 3
GROUP BY `realm`, `userId`
)
SQL小提琴:http://sqlfiddle.com/#!9/fb04fd/9/0
EXPLAIN EXTENDED
此查询:
+----+-------------+-------+-------------+-----------------+------------+---------+-------+---------+----------+------------------------------------------+--+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | |
+----+-------------+-------+-------------+-----------------+------------+---------+-------+---------+----------+------------------------------------------+--+
| 1 | PRIMARY | queue | ref | PRIMARY,ru,date | PRIMARY | 767 | const | 5266123 | 100.00 | Using where | |
| 2 | SUBQUERY | queue | index_merge | PRIMARY,ru | ru,PRIMARY | 771,767 | | 496 | 75.00 | Using intersect(ru,PRIMARY); Using where | |
+----+-------------+-------+-------------+-----------------+------------+---------+-------+---------+----------+------------------------------------------+--+
您是否有任何想法如何优化此查询以便在具有10,000,000个条目的表上更快地运行?
在此表上运行的其他查询:
SELECT `m`.*
FROM `queue` `m`
JOIN (
SELECT `m`.*
FROM `queue` `m`
WHERE `m`.`realm` = ?
ORDER BY `date` ASC
LIMIT 1
) `mm` ON `m`.`realm` = `mm`.`realm` AND `m`.`userId` = `mm`.`userId`;
和
DELETE FROM `queue` WHERE `realm` = ? AND `userId` = ?;
我如何优化索引?
答案 0 :(得分:2)
我觉得桌子DDL有问题。无论如何,我会重写您的查询,如:
SELECT (COUNT(DISTINCT `userId`)+1) `position`
FROM `queue`
WHERE `realm` = 'AB3'
AND `date` < (
SELECT min(`date`)
FROM `queue`
WHERE `realm` = 'AB3' AND `userId` = 3
)
并且可能对此查询有一个非常具体的索引,如:
index (realm, date)
您可以尝试安全指数
index (realm, date, userId)
但不确定它会比前一个更快。