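This may be the longest query ever posted to SO. Essentially, I'd like to know whether this is the right way to write the query I need and, given that it is very slow, how I can speed it up.
I've been reading about data normalization and related topics; perhaps this schema is over-normalized? Given that a user can have hundreds of thousands of items, I need this to run as fast as possible and to scale.
I have six tables: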
--
-- Table structure for table `emails`
--
CREATE TABLE IF NOT EXISTS `emails` (
`ID` int(5) NOT NULL AUTO_INCREMENT,
`email` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`sent` smallint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=5062 ;
-- --------------------------------------------------------
--
-- Table structure for table `ips`
--
CREATE TABLE IF NOT EXISTS `ips` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`ip` varchar(255) NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=7534 ;
-- --------------------------------------------------------
--
-- Table structure for table `user_items`
--
CREATE TABLE IF NOT EXISTS `user_items` (
`ID` int(10) NOT NULL AUTO_INCREMENT COMMENT 'Allows sorting by last added..',
`name` varchar(255) NOT NULL,
`owner` varchar(255) NOT NULL,
`folder` int(10) NOT NULL,
`version` int(5) NOT NULL,
PRIMARY KEY (`ID`),
KEY `name` (`name`),
KEY `folder` (`folder`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=10431 ;
-- --------------------------------------------------------
--
-- Table structure for table `data`
--
CREATE TABLE IF NOT EXISTS `data` (
`ID` int(10) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`info` varchar(255) DEFAULT NULL,
`inserted` varchar(255) NOT NULL,
`version` int(5) NOT NULL,
PRIMARY KEY (`ID`),
KEY `inserted` (`inserted`),
KEY `version` (`version`),
KEY `name_version` (`name`,`version`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=7207 ;
-- --------------------------------------------------------
--
-- Table structure for table `data_emails`
--
CREATE TABLE IF NOT EXISTS `data_emails` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`email_id` int(5) NOT NULL,
`name` varchar(255) NOT NULL,
`version` int(5) NOT NULL,
`time` int(255) NOT NULL,
PRIMARY KEY (`ID`),
KEY `version` (`version`),
KEY `name_version` (`name`,`version`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=9849 ;
-- --------------------------------------------------------
--
-- Table structure for table `data_ips`
--
CREATE TABLE IF NOT EXISTS `data_ips` (
`ID` int(5) NOT NULL AUTO_INCREMENT,
`ns_id` int(5) NOT NULL,
`name` varchar(255) NOT NULL,
`version` int(5) NOT NULL,
`time` int(255) NOT NULL,
PRIMARY KEY (`ID`),
KEY `version` (`version`),
KEY `name_version` (`name`,`version`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=17988 ;
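What I need to achieve is the following: fetch every user_item together with the data, emails, and ips associated with it. user_items is linked to data, data_emails, and data_ips by the "name" and "version" columns.
data_emails and data_ips are linked to emails and ips respectively using email_id / ip_id = ID.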
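To make the linkage concrete, here is a minimal sketch of how the emails for a single item are reached through the link table. The aliases and the item ID of 1 are placeholders chosen for illustration, not values from my data.

```sql
-- Sketch only: resolve the emails attached to one item.
-- user_items -> data_emails on (name, version), then data_emails -> emails on email_id = ID.
SELECT ui.ID, ui.name, ui.version, e.email, e.sent
FROM user_items AS ui
JOIN data_emails AS de
    ON de.name = ui.name AND de.version = ui.version
JOIN emails AS e
    ON e.ID = de.email_id
WHERE ui.ID = 1;  -- hypothetical item ID
```

The ips follow the same pattern through data_ips and ips.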
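Because multiple joins cause row multiplication, I had to use nested subqueries. Because a user can have several emails and ips associated with an item, I used GROUP_CONCAT to collapse all of the matching rows into a single row. I then explode that column in my PHP, which in itself seems inefficient, but I can't see any other way.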
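As a minimal sketch of the row multiplication I mean: joining both link tables directly pairs every email row for an item with every ip row for that item. The aliases and the joined_rows column name are placeholders for illustration.

```sql
-- Sketch only: count how many rows each item produces when both link tables
-- are joined directly, without pre-aggregating in subqueries.
SELECT ui.ID, COUNT(*) AS joined_rows
FROM user_items AS ui
LEFT JOIN data_emails AS de
    ON de.name = ui.name AND de.version = ui.version
LEFT JOIN data_ips AS di
    ON di.name = ui.name AND di.version = ui.version
WHERE ui.folder = 2
GROUP BY ui.ID;
```

An item with 3 matching data_emails rows and 4 matching data_ips rows comes back with joined_rows = 12 rather than 7, which is why each side is aggregated in its own subquery before joining.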
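I have played with indexes, but because the joins are complex and I'm quite new to this, I'm not sure which indexes are appropriate. Could someone suggest indexes and explain them?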
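For concreteness, this is a sketch of the kind of candidate indexes I am asking about. The index names are hypothetical, none of them exist in the schema above, and I have not verified that the optimizer would use any of them for this query.

```sql
-- Sketch only: candidate indexes, untested against this query.

-- Link column joining data_emails to emails; it is also the GROUP BY column
-- of the innermost emails subquery.
ALTER TABLE data_emails ADD INDEX idx_email_id (email_id);

-- Same idea for data_ips. The column is called ns_id in the schema above,
-- although the query refers to it as ip_id.
ALTER TABLE data_ips ADD INDEX idx_ns_id (ns_id);

-- Folder filter plus the two columns used to join user_items to data.
ALTER TABLE user_items ADD INDEX idx_folder_name_version (folder, name, version);
```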
SELECT user_items.ID, user_items.name, user_items.update, data.*, x.emails,
       x.e_status, y.ip, x.email_counts, y.ip_counts
FROM user_items
LEFT JOIN data AS data
    ON (data.name = user_items.name AND data.version = user_items.version)
LEFT JOIN (
    SELECT data_emails.name, data_emails.version,
           GROUP_CONCAT( b.email SEPARATOR ',' ) as emails,
           GROUP_CONCAT( b.sent SEPARATOR ',' ) as e_status,
           GROUP_CONCAT( b.email_count SEPARATOR ',' ) as email_counts
    FROM data_emails
    LEFT JOIN (
        SELECT emails.ID, emails.email, emails.sent, data_emails.name,
               data_emails.version, data_emails.email_id,
               COUNT(data_emails.ID) as email_count
        FROM data_emails
        LEFT JOIN emails ON (data_emails.email_id = emails.ID)
        GROUP BY data_emails.email_id
    ) b ON (data_emails.email_id = b.ID)
    GROUP BY data_emails.name
) x ON (data.name = x.name AND x.version = user_items.version)
LEFT JOIN (
    SELECT data_ips.name, data_ips.version,
           GROUP_CONCAT( c.ip SEPARATOR ',' ) as ip,
           GROUP_CONCAT( c.ips_count SEPARATOR ',' ) as ip_counts
    FROM data_ips
    LEFT JOIN (
        SELECT ips.ID, ips.ip, data_ips.name, data_ips.version,
               data_ips.ip_id, COUNT(data_ips.ID) as ips_count
        FROM data_ips
        LEFT JOIN ips ON (data_ips.ip_id = ips.ID)
        GROUP BY data_ips.ip_id
    ) c ON (data_ips.ip_id = c.ID)
    GROUP BY data_ips.name
) y ON (data.name = y.name AND y.version = user_items.version)
WHERE user_items.folder = '2'
GROUP BY user_items.name
For completeness, the EXPLAIN output for the query is as follows:
+----+-------------+-------------+--------+----------------------+--------------+---------+--------------------------------------------+-------+---------------------------------+
| id | select_type | table       | type   | possible_keys        | key          | key_len | ref                                        | rows  | Extra                           |
+----+-------------+-------------+--------+----------------------+--------------+---------+--------------------------------------------+-------+---------------------------------+
|  1 | PRIMARY     | user_items  | ref    | folder               | folder       | 4       | const                                      |  1139 | Using temporary; Using filesort |
|  1 | PRIMARY     | data        | ref    | version,name_version | name_version | 261     | gua.user_items.name,gua.user_items.version |     1 |                                 |
|  1 | PRIMARY     | <derived2>  | ALL    | NULL                 | NULL         | NULL    | NULL                                       |  5591 |                                 |
|  1 | PRIMARY     | <derived4>  | ALL    | NULL                 | NULL         | NULL    | NULL                                       |  5443 |                                 |
|  4 | DERIVED     | data_ips    | ALL    | NULL                 | NULL         | NULL    | NULL                                       | 16301 | Using temporary; Using filesort |
|  4 | DERIVED     | <derived5>  | ALL    | NULL                 | NULL         | NULL    | NULL                                       |  7533 |                                 |
|  5 | DERIVED     | data_ips    | ALL    | NULL                 | NULL         | NULL    | NULL                                       | 16301 | Using temporary; Using filesort |
|  5 | DERIVED     | ips         | eq_ref | PRIMARY              | PRIMARY      | 4       | gua.data_ips.ns_id                         |     1 |                                 |
|  2 | DERIVED     | data_emails | ALL    | NULL                 | NULL         | NULL    | NULL                                       | 10138 | Using temporary; Using filesort |
|  2 | DERIVED     | <derived3>  | ALL    | NULL                 | NULL         | NULL    | NULL                                       |  5061 |                                 |
|  3 | DERIVED     | data_emails | ALL    | NULL                 | NULL         | NULL    | NULL                                       | 10138 | Using temporary; Using filesort |
|  3 | DERIVED     | emails      | eq_ref | PRIMARY              | PRIMARY      | 4       | gua.data_emails.email_id                   |     1 |                                 |
+----+-------------+-------------+--------+----------------------+--------------+---------+--------------------------------------------+-------+---------------------------------+