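I have a large InnoDB table that currently holds about 20 million rows, with roughly 20,000 new rows inserted every day. The rows are messages from different topics.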
CREATE TABLE IF NOT EXISTS `Messages` (
`ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`TopicID` bigint(20) unsigned NOT NULL,
`DATESTAMP` int(11) DEFAULT NULL,
`TIMESTAMP` int(10) unsigned NOT NULL,
`Message` mediumtext NOT NULL,
`Checksum` varchar(50) DEFAULT NULL,
`Nickname` varchar(80) NOT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `TopicID` (`TopicID`,`Checksum`),
KEY `DATESTAMP` (`DATESTAMP`),
KEY `Nickname` (`Nickname`),
KEY `TIMESTAMP` (`TIMESTAMP`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=25195126 ;
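Note: Checksum stores an MD5 checksum that prevents the same message from being inserted twice into the same topic (nickname + timestamp + topic + last 20 characters of the message).
The website I am building has a news feed where users can choose to see the latest messages from different nicknames on different forums. The query is as follows: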
SELECT
Messages.ID AS MessageID,
Messages.Message,
Messages.TIMESTAMP,
Messages.Nickname,
Topics.ID AS TopicID,
Topics.Title AS TopicTitle,
Forums.Title AS ForumTitle
FROM Messages
JOIN FollowedNicknames ON FollowedNicknames.UserID = 'MYUSERID'
JOIN Forums ON Forums.ID = FollowedNicknames.ForumID
JOIN Subforums ON Subforums.ForumID = Forums.ID
JOIN Topics ON Topics.SubforumID = Subforums.ID
WHERE
Messages.Nickname = FollowedNicknames.Nickname AND
Messages.TopicID = Topics.ID AND Messages.DATESTAMP = '2013619'
ORDER BY Messages.TIMESTAMP DESC
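TIMESTAMP contains a Unix timestamp. DATESTAMP is just a date generated from that Unix timestamp, so that rows can be fetched with the '=' operator instead of a range scan on the Unix timestamp, which I expected to be faster.
The problem is that this query takes about 13 seconds (or more) when uncached. That is of course unacceptable for its intended use. Adding DATESTAMP seemed to speed things up, but not by much.
At this point I really don't know what to do. I have read about composite primary keys, but I am still not sure whether they would help at all, or how to implement one correctly in this particular case.
I know that using BIGINT is probably overkill, but would it really affect performance that much?
EXPLAIN: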
+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | FollowedNicknames | ALL | UserID,ForumID,Nickname | NULL | NULL | NULL | 8 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | Forums | eq_ref | PRIMARY | PRIMARY | 8 | database.FollowedNicknames.ForumiID | 1 | NULL |
| 1 | SIMPLE | Messages | ref | TopicID,DATETIME,Nickname | Nickname | 242 | database.FollowedNicknames.Nickname | 15 | Using where |
| 1 | SIMPLE | Topics | eq_ref | PRIMARY,SubforumID | PRIMARY | 8 | database.Messages.TopicID | 1 | NULL |
| 1 | SIMPLE | Subforums | eq_ref | PRIMARY,ForumID | PRIMARY | 8 | database.Topics.SubforumID | 1 | Using where |
+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+
Answer 0 (score: 0)
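You should not JOIN on the VARCHAR column Nickname; you should join these tables on user IDs instead. This is certainly slowing the query down and is probably the biggest problem. The query would also be easier to understand if you wrote all the JOIN conditions explicitly in the JOIN clauses rather than in the WHERE clause:
SELECT
Messages.ID AS MessageID,
Messages.Message,
Messages.TIMESTAMP,
Messages.Nickname,
Topics.ID AS TopicID,
Topics.Title AS TopicTitle,
Forums.Title AS ForumTitle
FROM Messages
JOIN FollowedNicknames ON Messages.Nickname = FollowedNicknames.Nickname
AND FollowedNicknames.UserID = 'MYUSERID'
JOIN Forums ON Forums.ID = FollowedNicknames.ForumID
JOIN Subforums ON Subforums.ForumID = Forums.ID
JOIN Topics ON Messages.TopicID = Topics.ID
AND Topics.SubforumID = Subforums.ID
WHERE Messages.DATESTAMP = '2013619'
ORDER BY Messages.TIMESTAMP DESC
I would use DATE instead of INT as the data type for the DATESTAMP column. The Checksum column should probably use latin1_general_ci as its collation. I would use INT for the ID columns as long as their values stay below 2,000,000,000; INT UNSIGNED can store values up to about 4,000,000,000. InnoDB is affected by the primary key more than MyISAM is, so this can make a significant difference.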
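As a rough illustration of the DATE and collation suggestions, the changes might look something like the sketch below. It is not tested against this schema: the column name MessageDate and its index are hypothetical, and it assumes the TIMESTAMP column holds valid Unix timestamps. FROM_UNIXTIME converts using the session time zone, so the derived dates depend on that setting.

```sql
-- Sketch only: MessageDate and its index name are hypothetical.
-- Add a real DATE column next to the existing INT-based DATESTAMP.
ALTER TABLE `Messages`
  ADD COLUMN `MessageDate` DATE NULL,
  ADD KEY `MessageDate` (`MessageDate`);

-- Populate it from the Unix timestamp already stored in TIMESTAMP.
-- The conversion uses the session time zone.
UPDATE `Messages`
SET `MessageDate` = DATE(FROM_UNIXTIME(`TIMESTAMP`));

-- An MD5 hex string only needs single-byte characters, so a latin1
-- collation keeps the (TopicID, Checksum) unique index smaller than utf8.
ALTER TABLE `Messages`
  MODIFY `Checksum` VARCHAR(50)
    CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL;
```

On a table with about 20 million rows these statements are likely to take a while and may rebuild the table, so it would be sensible to try them on a copy first and to run the UPDATE in batches if locking is a concern.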