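Note: you can find my previous question and its answer here - MySQL: Writing a complex query
I have 3 tables.
Table Words_Learned
Contains all the words known to the user, and the order in which the words were learned. It has 3 columns: 1) word ID, 2) user ID, and 3) the order in which the word was learned.
Table Article
Contains the articles. It has 3 columns: 1) article ID, 2) unique word count, and 3) article content.
Table Words
Contains the list of all unique words contained in each article. It has 2 columns: 1) word ID and 2) article ID.
The database diagram is as follows.
You can download the database code from here: https://www.dropbox.com/s/3gr659y5mk05i5w/tests.sql?dl=0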
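Now, using this database and using "only" MySQL, I need to do the following.
Given a user ID, it should get a list of all the words known to that user, sorted in the reverse of the order in which they were learned. In other words, the most recently learned words will be at the top of the list.
Suppose a query on a user ID shows that they have memorized the following 3 words, and that we track the order in which they learned them:
Octopus - 3
Dog - 2
Spoon - 1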
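First we get a list of all articles containing the word Octopus, and then do the calculation using table Words on just those articles. The calculation means that if an article contains more than 10 words that do not appear in the user's vocabulary list (pulled from table words_learned), it is excluded from the list.
Then we query for all records containing Dog, but not those containing "Octopus".
Then we query for all records containing Spoon, but not those containing Octopus or Dog.
You keep repeating this process until we have found 100 records that meet the criteria.
To implement this process, I did the following: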
SELECT `words_learned`.`idwords`,
Words.`idArticle`
FROM words_learned
INNER JOIN Words ON Words.idWords = Words_Learned.`idwords`
WHERE words_learned.userId = 1
ORDER BY Words_Learned.`order` DESC
In my query I have covered getting the articles, which means up to here - First we get a list of all articles containing the word Octopus, and then do the calculation using table Words on just those articles. But what do I need to do to cover the rest?
UPDATE
Here is some pseudocode for better understanding.
Do while articles found < 100
{
for each ($X as known words, in order that those words were learned)
{
Select all articles that contain the word $X, where the 1) article has not been included in any previous loops, and 2)where the count of "unknown" words is less than 10.
Keep these articles in order.
}
}
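As a rough illustration of the loop above, here is a minimal Python sketch (not MySQL, so it is not a solution to the question itself). The function name `select_articles`, its parameters, and the in-memory dictionaries are hypothetical. It follows the prose rule that an article with more than 10 unknown words is excluded, and it simply stops when the known words run out instead of looping forever when fewer than 100 articles qualify.

```python
def select_articles(learned, article_words, limit=100, max_unknown=10):
    """Pick articles word by word, most recently learned word first.

    learned:       dict mapping word -> order learned (higher = more recent)
    article_words: dict mapping article id -> set of unique words in it
    Both structures are hypothetical stand-ins for the three tables.
    """
    known = set(learned)
    picked = []   # (word, article id) pairs, in the order they were found
    seen = set()  # articles already included by an earlier word

    for word in sorted(learned, key=learned.get, reverse=True):
        for article in sorted(article_words):
            words = article_words[article]
            if article in seen or word not in words:
                continue
            # Skip articles with too many words the user does not know.
            if len(words - known) > max_unknown:
                continue
            picked.append((word, article))
            seen.add(article)
            if len(picked) == limit:
                return picked
    return picked
```

Because the unknown-word test depends only on the article, an article rejected while processing one word would be rejected again for any later word, so each qualifying article ends up attached to its most recently learned word.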
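Answer 0 (score: 1)
So, I think this is it. You want the "best" 100 articles, where "best" means that the more recently a contained word was learned, the better. So I look up the last learned word for each article (max(words_learned.order) per article). Then I show the article IDs in that order and stop at 100.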
select w.idarticle, max(l.`order`)
from words w
join words_learned l on l.idwords = w.idwords and l.userid = 123
group by w.idarticle
order by max(l.`order`) desc
limit 100;
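You have edited your request. You want to limit the results to articles containing no more than ten unknown words. For this you now have to outer join the learned words, so that you can count the unknown words (i.e. the outer-joined records). Use HAVING to remove the undesired articles from the list.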
select w.idarticle, max(l.`order`)
from words w
left join words_learned l on l.idwords = w.idwords and l.userid = 123
group by w.idarticle
having sum(l.idwords is null) <= 10 and max(l.`order`) is not null
order by max(l.`order`) desc
limit 100;
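To see how the outer join, the unknown-word count, and the HAVING filter interact, here is a small sketch that runs the same query shape against a throwaway database. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table definitions, the column name `userid`, the sample rows, and the helper `rank_articles` are all assumptions made up for the illustration.

```python
import sqlite3

# Same shape as the query above; the user id is passed as a parameter.
QUERY = """
select w.idarticle, max(l.`order`) as last_learned
from words w
left join words_learned l on l.idwords = w.idwords and l.userid = ?
group by w.idarticle
having sum(l.idwords is null) <= 10 and max(l.`order`) is not null
order by max(l.`order`) desc
limit 100
"""


def rank_articles(user_id=123):
    """Build a tiny in-memory data set (hypothetical) and run the query."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table words_learned (idwords integer, userid integer, `order` integer);
        create table words (idwords integer, idarticle integer);
    """)
    # User 123 learned word 1 first, then word 2, then word 3.
    con.executemany("insert into words_learned values (?, ?, ?)",
                    [(1, 123, 1), (2, 123, 2), (3, 123, 3)])
    rows = [(3, 10), (2, 10),    # article 10: words 3 and 2, nothing unknown
            (2, 20), (1, 20),    # article 20: words 2 and 1
            (1, 30), (99, 30)]   # article 30: word 1 plus one unknown word
    # Article 40: word 3 plus eleven unknown words, so HAVING rejects it.
    rows += [(3, 40)] + [(100 + i, 40) for i in range(11)]
    # Article 50: only an unknown word, so max(order) is null.
    rows += [(200, 50)]
    con.executemany("insert into words values (?, ?)", rows)
    result = con.execute(QUERY, (user_id,)).fetchall()
    con.close()
    return result
```

With this data, articles 10, 20 and 30 come back in that order, while article 40 (too many unknown words) and article 50 (no known word at all) are filtered out.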
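Answer 1 (score: 1)
I would be tempted to use a sub query that gets all the words a person has learned, joins it against itself, and uses GROUP_CONCAT on the joined words together with a count. So giving:-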
Octopus, NULL, 0
Dog, "Octopus", 1
Spoon, "Octopus,Dog", 2
So the sub query would be something like:-
SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords
giving
idwords excl_words older_words_cnt
1 NULL 0
2 1 1
3 1,2 2
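For readers who find the self join hard to picture, the same idea can be sketched in a few lines of Python. The function name `exclusion_lists` and the input dictionary are hypothetical; the sketch pairs each learned word with every word that has a larger order_learned, which is what the join condition above does.

```python
def exclusion_lists(order_learned):
    """Map each word id to the ids of words with a larger order_learned.

    order_learned: dict mapping idwords -> order_learned (hypothetical input).
    The lists are sorted most recent first; GROUP_CONCAT without an
    ORDER BY does not promise any particular order.
    """
    result = {}
    for word, position in order_learned.items():
        later = [w for w, p in order_learned.items() if p > position]
        later.sort(key=order_learned.get, reverse=True)
        result[word] = later
    return result
```

The length of each list corresponds to older_words_cnt in the sub query.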
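Then join the results of this against the other tables, checking for articles where the main idwords matches but none of the other words are found.
Something like this (although not tested, as I have no test data):-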
SELECT sub_words.idwords, words_inc.idArticle
FROM
(
SELECT sub0.idwords, SUBSTRING_INDEX(GROUP_CONCAT(sub1.idwords), ',', 10) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords
) sub_words
INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords
LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100
EDIT - updated to exclude articles with more than 10 words that have not yet been learned.
SELECT sub_words.idwords, words_inc.idArticle,
sub2.idArticle, sub2.count, sub2.content
FROM
(
SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords
) sub_words
INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords
INNER JOIN
(
SELECT a.idArticle, a.count, a.content, SUM(IF(c.idwords_learned IS NULL, 1, 0)) AS unlearned_words_count
FROM Article a
INNER JOIN words b
ON a.idArticle = b.idArticle
LEFT OUTER JOIN words_learned c
ON b.idwords = c.idwords
AND c.userId = 1
GROUP BY a.idArticle, a.count, a.content
HAVING unlearned_words_count < 10
) sub2
ON words_inc.idArticle = sub2.idArticle
LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100
EDIT - an attempt at commenting the above query:-
This just selects the columns
SELECT sub_words.idwords, words_inc.idArticle,
sub2.idArticle, sub2.count, sub2.content
FROM
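This sub query gets each of the learned words, along with a comma-separated list of the words that have a larger order_learned. This is for a particular user ID.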
(
SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords
) sub_words
This is just to get the articles in which the words (i.e. the learned words from the above sub query) are used
INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords
This sub query gets the articles that contain fewer than 10 words the particular user has not yet learned.
INNER JOIN
(
SELECT a.idArticle, a.count, a.content, SUM(IF(c.idwords_learned IS NULL, 1, 0)) AS unlearned_words_count
FROM Article a
INNER JOIN words b
ON a.idArticle = b.idArticle
LEFT OUTER JOIN words_learned c
ON b.idwords = c.idwords
AND c.userId = 1
GROUP BY a.idArticle, a.count, a.content
HAVING unlearned_words_count < 10
) sub2
ON words_inc.idArticle = sub2.idArticle
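This join is there to find articles that contain words from the comma-separated list in the first sub query (i.e. words with a larger order_learned). It is done as a LEFT OUTER JOIN because I want to exclude any articles where such words are found (this is done in the WHERE clause by checking for NULL).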
LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100
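Answer 2 (score: 1)
I read the question again and noticed that it is much more complicated.
First of all, you want to show words. And whether you show a word depends on that word and on all previously learned words (and the articles they occur in).
So with these words learned: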
word     order
Octopus  3
Dog      2
Spoon    1 (i.e. first learned)
And these articles:
article  contains Octopus  contains Dog  contains Spoon  unknown words
A        yes               yes           yes             5
B        yes               yes           no              11
C        yes               no            yes             15
D        no                yes           yes             2
E        no                yes           no              0
F        no                no            yes             8
G        no                no            no              3
H        no                no            no              20
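You...
So you show "Dog" and "Spoon" but not "Octopus". If there were not just two matches but thousands, you would show the first 100 and then stop.
Given this algorithm, we can conclude:
Query: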
select idwords
from words_learned
where userid = 123
and not exists
(
select w.idarticle
from words w
left join words_learned l on l.idwords = w.idwords and l.userid = 123
group by w.idarticle
having sum(l.idwords is null) > 10 and max(l.`order`) = words_learned.`order`
)
order by `order` desc
limit 100;
Here is an SQL Fiddle: http://sqlfiddle.com/#!2/19bf8/1.
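To check the query's logic against the example tables in this answer, here is a small Python sketch of the same rule: a word is dropped if some article with more than 10 unknown words has that word as its most recently learned word. The names `LEARNED`, `ARTICLES` and `words_to_show` are hypothetical, and the unknown-word counts are taken directly from the example table rather than computed from a words table.

```python
# Words learned in the example, with their order (higher = more recent).
LEARNED = {"Octopus": 3, "Dog": 2, "Spoon": 1}

# article -> (known words it contains, number of unknown words)
ARTICLES = {
    "A": ({"Octopus", "Dog", "Spoon"}, 5),
    "B": ({"Octopus", "Dog"}, 11),
    "C": ({"Octopus", "Spoon"}, 15),
    "D": ({"Dog", "Spoon"}, 2),
    "E": ({"Dog"}, 0),
    "F": ({"Spoon"}, 8),
    "G": (set(), 3),
    "H": (set(), 20),
}


def words_to_show(learned, articles, max_unknown=10, limit=100):
    """Return learned words, most recent first, minus the rejected ones."""
    rejected = set()
    for known, unknown in articles.values():
        # Mirrors the HAVING clause: too many unknown words, and the
        # article's most recently learned word is the one being tested.
        if known and unknown > max_unknown:
            rejected.add(max(known, key=learned.get))
    ordered = sorted(learned, key=learned.get, reverse=True)
    return [word for word in ordered if word not in rejected][:limit]
```

On the example data, articles B and C both have Octopus as their most recently learned word and exceed the limit, so Octopus is dropped while Dog and Spoon remain, in line with the result described above. Article H also exceeds the limit but contains no learned word, so it rejects nothing.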