Consider the following tables:
[Table: talks]
talkID | title | starred
-------+--------------+--------
1 | talk1-title | 1
2 | talk2-title | 1
3 | talk3-title | 0
4 | talk4-title | 0
5 | talk5-title | 0
[Table: talkspeaker]
talkID | speaker
-------+---------
1 | Speaker1
1 | Speaker2
2 | Speaker3
3 | Speaker4
3 | Speaker5
4 | Speaker6
5 | Speaker7
5 | Speaker8
[Table: similartalks]
talkID | similarTo
-------+----------
1 | 3
1 | 4
2 | 3
2 | 4
2 | 5
3 | 2
4 | 5
5 | 3
5 | 4
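What I want to do: given the set of starred talks, select the top 2 unstarred talks (starred = 0) that are most similar to the starred talks, together with their titles and speakers. The problem is that getting the speakers requires an aggregate function, and so does finding the most similar talks.
Without the speakers, I can get the most similar talks with the following query: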
select t2.talkID, t2.title, count(*) as count
from similarTalks s, talks t1, talks t2
where s.talkID = t1.talkID
and t1.Starred = 1
and s.similarTo = t2.TalkID
and t2.Starred = 0
group by t2.title, t2.talkID
order by count desc
limit 2
Normally I use the following aggregate function to get the speakers, grouping by the appropriate columns (assuming t = talkspeaker):
group_concat(t.speaker, ', ') as Speakers
as in:
select t1.title, group_concat(t2.speaker, ', ') as Speakers
from talks t1, talkspeaker t2
where t1.talkID = t2.talkID
group by t1.title
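To make the effect of group_concat concrete, here is a minimal sketch that loads the sample talks and talkspeaker rows into an in-memory SQLite database and runs the query above. The Python sqlite3 wrapper and the names conn and speakers_by_title are assumptions added for illustration; only the SQL comes from the question.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE talks (talkID INTEGER PRIMARY KEY, title TEXT, starred INTEGER);
CREATE TABLE talkspeaker (talkID INTEGER, speaker TEXT);
INSERT INTO talks VALUES
    (1, 'talk1-title', 1), (2, 'talk2-title', 1), (3, 'talk3-title', 0),
    (4, 'talk4-title', 0), (5, 'talk5-title', 0);
INSERT INTO talkspeaker VALUES
    (1, 'Speaker1'), (1, 'Speaker2'), (2, 'Speaker3'), (3, 'Speaker4'),
    (3, 'Speaker5'), (4, 'Speaker6'), (5, 'Speaker7'), (5, 'Speaker8');
""")

# The question's speaker query: one row per title, speakers joined by ', '.
rows = conn.execute("""
    SELECT t1.title, group_concat(t2.speaker, ', ') AS Speakers
    FROM talks t1, talkspeaker t2
    WHERE t1.talkID = t2.talkID
    GROUP BY t1.title
""").fetchall()

speakers_by_title = dict(rows)
for title, speakers in sorted(speakers_by_title.items()):
    print(title, "->", speakers)
```

SQLite does not promise any particular order for the concatenated values, so code that consumes the Speakers column should not rely on it.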
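But I can't manage to combine the two. It may be relevant that I plan to run this query on a SQLite database (which is where the group_concat function comes from). The answer for the top 2 unstarred talks most similar to the starred talks should be talkIDs 3 and 4.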
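Answer 0 (score: 5)
First, you may want to read this article on why to use ANSI 92 joins rather than the aging ANSI 89 syntax used above. Second, SQLite supports the GROUP_CONCAT function, so you can use it.
You just need to add your second query as a subquery to the first one to get the desired result: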
SELECT  Talks.TalkID,
        Talks.Title,
        ts.Speakers,
        COUNT(*) AS SimilarTalks
FROM    Talks
        INNER JOIN SimilarTalks
            ON Talks.TalkID = SimilarTalks.SimilarTo
        INNER JOIN Talks t2
            ON SimilarTalks.TalkID = t2.TalkID
            AND t2.Starred = 1
        INNER JOIN
        (   SELECT  TalkID, GROUP_CONCAT(Speaker, ',') AS Speakers
            FROM    TalkSpeaker
            GROUP BY TalkID
        ) ts
            ON ts.TalkID = Talks.TalkID
WHERE   Talks.Starred = 0
GROUP BY Talks.TalkID, Talks.Title, ts.Speakers
ORDER BY COUNT(*) DESC
LIMIT 2;
Example on SQL Fiddle
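For anyone who wants to check this locally, the following is a minimal sketch that runs the subquery version against the question's sample data in an in-memory SQLite database. The Python harness and the names conn, rows and by_id are illustrative assumptions, not part of the answer; the SQL is the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE talks (talkID INTEGER PRIMARY KEY, title TEXT, starred INTEGER);
CREATE TABLE talkspeaker (talkID INTEGER, speaker TEXT);
CREATE TABLE similartalks (talkID INTEGER, similarTo INTEGER);
INSERT INTO talks VALUES
    (1, 'talk1-title', 1), (2, 'talk2-title', 1), (3, 'talk3-title', 0),
    (4, 'talk4-title', 0), (5, 'talk5-title', 0);
INSERT INTO talkspeaker VALUES
    (1, 'Speaker1'), (1, 'Speaker2'), (2, 'Speaker3'), (3, 'Speaker4'),
    (3, 'Speaker5'), (4, 'Speaker6'), (5, 'Speaker7'), (5, 'Speaker8');
INSERT INTO similartalks VALUES
    (1, 3), (1, 4), (2, 3), (2, 4), (2, 5), (3, 2), (4, 5), (5, 3), (5, 4);
""")

# Speakers are aggregated per talk in a derived table first, so joining
# them in adds exactly one row per talk and COUNT(*) stays correct.
rows = conn.execute("""
    SELECT Talks.TalkID, Talks.Title, ts.Speakers, COUNT(*) AS SimilarTalks
    FROM Talks
    INNER JOIN SimilarTalks ON Talks.TalkID = SimilarTalks.SimilarTo
    INNER JOIN Talks t2 ON SimilarTalks.TalkID = t2.TalkID AND t2.Starred = 1
    INNER JOIN (SELECT TalkID, GROUP_CONCAT(Speaker, ',') AS Speakers
                FROM TalkSpeaker
                GROUP BY TalkID) ts ON ts.TalkID = Talks.TalkID
    WHERE Talks.Starred = 0
    GROUP BY Talks.TalkID, Talks.Title, ts.Speakers
    ORDER BY COUNT(*) DESC
    LIMIT 2
""").fetchall()

by_id = {talk_id: (title, speakers, n) for talk_id, title, speakers, n in rows}
for talk_id in sorted(by_id):
    print(talk_id, by_id[talk_id])
```

On the sample data this returns talks 3 and 4, each similar to two starred talks, which matches the result the question expects. Both have the same count, so the relative order of the two rows is not determined by the query.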
EDIT
You can also do this without the subquery, by using DISTINCT:
SELECT  Talks.TalkID,
        Talks.Title,
        GROUP_CONCAT(DISTINCT ts.Speaker) AS Speakers,
        COUNT(DISTINCT t2.TalkID) AS SimilarTalks
FROM    Talks
        INNER JOIN SimilarTalks
            ON Talks.TalkID = SimilarTalks.SimilarTo
        INNER JOIN Talks t2
            ON SimilarTalks.TalkID = t2.TalkID
            AND t2.Starred = 1
        INNER JOIN TalkSpeaker ts
            ON ts.TalkID = Talks.TalkID
WHERE   Talks.Starred = 0
GROUP BY Talks.TalkID, Talks.Title
ORDER BY COUNT(DISTINCT t2.TalkID) DESC
LIMIT 2;
However, I see no advantage to this approach, and it is probably less efficient (I haven't tested it, so I can't be sure).
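The reason DISTINCT is needed in the flat-join version is that joining TalkSpeaker directly multiplies each (starred talk, unstarred talk) pair by the number of speakers. A small sketch of that effect on the sample data follows; the helper counts() and the names naive and distinct are hypothetical, introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE talks (talkID INTEGER PRIMARY KEY, title TEXT, starred INTEGER);
CREATE TABLE talkspeaker (talkID INTEGER, speaker TEXT);
CREATE TABLE similartalks (talkID INTEGER, similarTo INTEGER);
INSERT INTO talks VALUES
    (1, 'talk1-title', 1), (2, 'talk2-title', 1), (3, 'talk3-title', 0),
    (4, 'talk4-title', 0), (5, 'talk5-title', 0);
INSERT INTO talkspeaker VALUES
    (1, 'Speaker1'), (1, 'Speaker2'), (2, 'Speaker3'), (3, 'Speaker4'),
    (3, 'Speaker5'), (4, 'Speaker6'), (5, 'Speaker7'), (5, 'Speaker8');
INSERT INTO similartalks VALUES
    (1, 3), (1, 4), (2, 3), (2, 4), (2, 5), (3, 2), (4, 5), (5, 3), (5, 4);
""")


def counts(count_expr):
    """Return {unstarred talkID: count} for the flat join, using count_expr."""
    sql = f"""
        SELECT Talks.TalkID, {count_expr}
        FROM Talks
        INNER JOIN SimilarTalks ON Talks.TalkID = SimilarTalks.SimilarTo
        INNER JOIN Talks t2 ON SimilarTalks.TalkID = t2.TalkID AND t2.Starred = 1
        INNER JOIN TalkSpeaker ts ON ts.TalkID = Talks.TalkID
        WHERE Talks.Starred = 0
        GROUP BY Talks.TalkID
    """
    return dict(conn.execute(sql).fetchall())


naive = counts("COUNT(*)")                      # counts speaker duplicates too
distinct = counts("COUNT(DISTINCT t2.TalkID)")  # counts starred talks only
print("COUNT(*):", naive)
print("COUNT(DISTINCT t2.TalkID):", distinct)
```

On the sample data the plain COUNT(*) gives 4, 2 and 2 for talks 3, 4 and 5, so talks 4 and 5 would tie for second place, while the DISTINCT version gives the intended 2, 2 and 1.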
Answer 1 (score: 3)
First, to get the IDs of the desired talks, remove the other fields from your first query:
SELECT unstarred.talkID
FROM talks AS starred
JOIN similarTalks AS s ON starred.talkID = s.talkID
JOIN talks AS unstarred ON s.similarTo = unstarred.talkID
WHERE starred.starred
AND NOT unstarred.starred
GROUP BY unstarred.talkID
ORDER BY COUNT(*) DESC
LIMIT 2
Then use it as a subquery to get the information about the desired talks:
SELECT t.title AS Title,
group_concat(s.speaker, ', ') AS Speakers
FROM talks AS t JOIN talkspeaker AS s ON t.talkID = s.talkID
WHERE t.talkID IN (SELECT unstarred.talkID
                   FROM talks AS starred
                   JOIN similarTalks AS s ON starred.talkID = s.talkID
                   JOIN talks AS unstarred ON s.similarTo = unstarred.talkID
                   WHERE starred.starred
                   AND NOT unstarred.starred
                   GROUP BY unstarred.talkID
                   ORDER BY COUNT(*) DESC
                   LIMIT 2)
GROUP BY t.talkID
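One thing to keep in mind with the IN form is that the outer query has no ORDER BY, so the similarity ranking computed inside the subquery is not carried over to the final rows. If the ranking matters, one possible variant (an assumption on my part, not something from the answer above) is to join against the ranked subquery as a derived table and expose its count. The alias ranked and the column cnt are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE talks (talkID INTEGER PRIMARY KEY, title TEXT, starred INTEGER);
CREATE TABLE talkspeaker (talkID INTEGER, speaker TEXT);
CREATE TABLE similartalks (talkID INTEGER, similarTo INTEGER);
INSERT INTO talks VALUES
    (1, 'talk1-title', 1), (2, 'talk2-title', 1), (3, 'talk3-title', 0),
    (4, 'talk4-title', 0), (5, 'talk5-title', 0);
INSERT INTO talkspeaker VALUES
    (1, 'Speaker1'), (1, 'Speaker2'), (2, 'Speaker3'), (3, 'Speaker4'),
    (3, 'Speaker5'), (4, 'Speaker6'), (5, 'Speaker7'), (5, 'Speaker8');
INSERT INTO similartalks VALUES
    (1, 3), (1, 4), (2, 3), (2, 4), (2, 5), (3, 2), (4, 5), (5, 3), (5, 4);
""")

# The ranked derived table picks the top 2 unstarred talks and keeps their
# counts; the outer query attaches titles and speakers and re-applies the order.
rows = conn.execute("""
    SELECT t.talkID, t.title,
           group_concat(sp.speaker, ', ') AS Speakers,
           ranked.cnt
    FROM (SELECT s.similarTo AS talkID, COUNT(*) AS cnt
          FROM talks AS starred
          JOIN similartalks AS s ON starred.talkID = s.talkID
          JOIN talks AS unstarred ON s.similarTo = unstarred.talkID
          WHERE starred.starred AND NOT unstarred.starred
          GROUP BY s.similarTo
          ORDER BY cnt DESC
          LIMIT 2) AS ranked
    JOIN talks AS t ON t.talkID = ranked.talkID
    JOIN talkspeaker AS sp ON sp.talkID = t.talkID
    GROUP BY t.talkID, t.title, ranked.cnt
    ORDER BY ranked.cnt DESC, t.talkID
""").fetchall()

for row in rows:
    print(row)
```

The extra t.talkID in the final ORDER BY is only a tie-breaker so that talks with equal counts come back in a stable order.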