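I've been struggling with MySQL joins. Despite reading dozens of tutorials and the MySQL manual, I still find them hard to understand, and we're starting to rely on them more and more.
My situation is that I have 3 tables: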
/* BASICALLY A TABLE THAT DESCRIBES A FAN RECORD */
CREATE TABLE `fans` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `first_name` varchar(255) DEFAULT NULL,
  `middle_name` varchar(255) DEFAULT NULL,
  `last_name` varchar(255) DEFAULT NULL,
  `email` varchar(255) DEFAULT NULL,
  `join_date` datetime DEFAULT NULL,
  `twitter` varchar(255) DEFAULT NULL,
  `twitterCrawled` datetime DEFAULT NULL,
  `twitterImage` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `email` (`email`)
) ENGINE=MyISAM AUTO_INCREMENT=20413 DEFAULT CHARSET=latin1;

/* A TABLE OF OUR TWITTER FOLLOWERS */
CREATE TABLE `twitterFollowers` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `screenName` varchar(25) DEFAULT NULL,
  `twitterId` varchar(25) DEFAULT NULL,
  `customerId` int(11) DEFAULT NULL,
  `uniqueStr` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique` (`uniqueStr`)
) ENGINE=InnoDB AUTO_INCREMENT=13426 DEFAULT CHARSET=utf8;

/* TABLE THAT SUGGESTS A LIKELY MATCH OF A TWITTER FOLLOWER
   BASED ON THE EMAIL / SCREEN NAME COMPARISON OF THE FAN vs OUR FOLLOWERS
   IF SOMEONE (ie. a moderator) CONFIRMS OR DENIES THAT IT'S A GOOD MATCH
   THEY PUT A DATESTAMP IN `dismissed` */
CREATE TABLE `contentSuggestion` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `userId` int(11) DEFAULT NULL,
  `fanId` int(11) DEFAULT NULL,
  `twitterAccountId` int(11) DEFAULT NULL,
  `contentType` varchar(50) DEFAULT NULL,
  `contentString` varchar(255) DEFAULT NULL,
  `added` datetime DEFAULT NULL,
  `dismissed` datetime DEFAULT NULL,
  `uniqueStr` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unstr` (`uniqueStr`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;
What I want is:
SELECT [fan columns] WHERE the fan's screen name is in twitterFollowers AND WHERE the fan's screen name is NOT in contentSuggestion (with a dismissed datestamp)
My attempts so far:
~33 seconds
SELECT fans.id, tf.screenName as col1, tf.twitterId as col2
FROM fans
LEFT JOIN twitterFollowers tf ON tf.screenName = fans.emailUsername
LEFT JOIN contentSuggestion cs ON cs.contentString = tf.screenName
WHERE dismissed IS NULL
GROUP BY (fans.id)
HAVING col1 != ''

~14 seconds
SELECT id, emailUsername
FROM fans
WHERE emailUsername IN (SELECT DISTINCT(screenName) FROM twitterFollowers)
  AND emailUsername NOT IN (SELECT DISTINCT(contentString) FROM contentSuggestion WHERE dismissed IS NULL)
GROUP BY (fans.id);

9.53 seconds
SELECT fans.id, tf.screenName as col1, tf.twitterId as col2
FROM fans
LEFT JOIN twitterFollowers tf ON tf.screenName = fans.emailUsername
WHERE tf.uniqueStr NOT IN (SELECT uniqueStr FROM contentSuggestion WHERE dismissed IS NULL)
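I'm hoping there's a better way. I've been struggling to really use JOINs beyond a single LEFT JOIN, which has already helped me speed up other queries considerably.
Thanks for any help.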
Answer 0 (score: 0)
I would go with a variation of the second approach. Instead of IN, use EXISTS. Then add the right indexes and remove the aggregation:
SELECT f.id, f.emailUsername
FROM fans f
WHERE EXISTS (SELECT 1
              FROM twitterFollowers tf
              WHERE f.emailUsername = tf.screenName
             ) AND
      NOT EXISTS (SELECT 1
                  FROM contentSuggestion cs
                  WHERE f.emailUsername = cs.contentString AND
                        cs.dismissed IS NULL
                 );
Then make sure you have the following indexes: twitterFollowers(screenName) and contentSuggestion(contentString, dismissed).
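As a rough sketch, the two indexes could be created as shown here. The index names (idx_tf_screenName, idx_cs_contentString_dismissed) are made up for illustration; the table and column names come from the question. Prefixing the query with EXPLAIN is one way to check whether MySQL actually picks the indexes up. This also assumes fans really has the emailUsername column that the queries use, since the posted CREATE TABLE does not list it.

```sql
-- Hypothetical index names; tables and columns are taken from the question.
CREATE INDEX idx_tf_screenName
    ON twitterFollowers (screenName);

CREATE INDEX idx_cs_contentString_dismissed
    ON contentSuggestion (contentString, dismissed);

-- Inspect the execution plan to see whether the new indexes are used.
-- Assumes fans.emailUsername exists (it is not in the posted schema).
EXPLAIN
SELECT f.id, f.emailUsername
FROM fans f
WHERE EXISTS (SELECT 1
              FROM twitterFollowers tf
              WHERE f.emailUsername = tf.screenName)
  AND NOT EXISTS (SELECT 1
                  FROM contentSuggestion cs
                  WHERE f.emailUsername = cs.contentString
                    AND cs.dismissed IS NULL);
```

If the plan still shows full scans on twitterFollowers or contentSuggestion, the indexes are probably not being used, and it may be worth checking that the compared columns have compatible character sets and collations.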
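Some notes:
- When using IN, don't use SELECT DISTINCT. I'm not 100% sure MySQL is always smart enough to ignore the DISTINCT in the subquery (it is redundant).
- EXISTS has tended to be faster than IN in MySQL, although the optimizer has improved in recent versions.
- The indexes that matter here are twitterFollowers(screenName) and contentSuggestion(contentString, dismissed).
- If fans.id is unique (a very reasonable assumption), you don't need the final GROUP BY.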
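To illustrate the first and last notes, here is a sketch of the asker's second query kept in its IN form, with the DISTINCT and the final GROUP BY dropped. It assumes fans.id is unique (it is the primary key in the posted schema) and that fans has the emailUsername column used in the question's queries. The dismissed IS NULL condition is carried over unchanged from that query.

```sql
-- The asker's second query without DISTINCT in the subqueries
-- and without the final GROUP BY (fans.id is the primary key).
-- Assumes fans.emailUsername exists (it is not in the posted schema).
SELECT id, emailUsername
FROM fans
WHERE emailUsername IN (SELECT screenName
                        FROM twitterFollowers)
  AND emailUsername NOT IN (SELECT contentString
                            FROM contentSuggestion
                            WHERE dismissed IS NULL);
```

One thing to keep in mind with this form: contentString is nullable, and NOT IN matches no rows at all if the subquery returns a NULL. NOT EXISTS does not have that behavior, which is another reason to prefer it here.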