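In particular, the tables are: articles, authors, authors_articles (linking authors to articles), subjectareas (the authors' subject areas), and authors_subjectareas (linking authors to their subject areas).
I want to go through the articles table row by row, find each article's authors, look up their subject areas, count the subject areas across all co-authors of that article, and finally assign to the article the subject area with the highest frequency. I wrote the query below, but the problem is that it computes this over all articles at once instead of separately for each article!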
SELECT art.name AS title, art.theAbstract AS abstract, sub.name AS subjectArea
FROM
    articles AS art, authors AS aut, subjectareas AS sub, authors_articles AS aa,
    authors_subjectareas AS asub
WHERE
    art.id = aa.article AND aut.id = aa.author AND asub.author = aut.id AND
    sub.id = asub.subjectArea AND (art.year >= 2000 AND art.year <= 2004)
GROUP BY subjectArea
ORDER BY COUNT(subjectArea) DESC
LIMIT 1
Thank you very much for your comments.
Answer 0 (score: 0)
You are looking for a groupwise maximum taken over a materialized table:
SELECT t2.name AS title, t2.theAbstract AS abstract, sub.name AS subjectArea
FROM (
    -- get each article's maximum co-author subject frequency
    -- (grouped by article id, so MAX is taken per article, not over all rows)
    SELECT t.id, MAX(t.freq) AS freq FROM (
        -- the subject frequencies of each article
        SELECT art.id, COUNT(*) AS freq
        FROM authors_articles aa
        JOIN authors_subjectareas asub USING (author)
        JOIN articles art ON art.id = aa.article
        GROUP BY art.id, asub.subjectArea
    ) t
    GROUP BY t.id
) t1 NATURAL JOIN (
    -- the information we actually want; the NATURAL JOIN matches on (id, freq)
    SELECT art.id, art.name, art.theAbstract, asub.subjectArea, COUNT(*) AS freq
    FROM authors_articles aa
    JOIN authors_subjectareas asub USING (author)
    JOIN articles art ON art.id = aa.article
    GROUP BY art.id, asub.subjectArea
) t2 JOIN subjectareas sub ON sub.id = t2.subjectArea
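A quick way to sanity-check a groupwise-maximum query of this shape is to run it against a few rows of toy data. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module for convenience (the question does not say which database is in use), the column types and sample rows are made up, and the helper names build_demo_db and top_subject_per_article are hypothetical. The query in it groups the per-article maximum by article id, so each article is matched against its own highest subject frequency.

```python
import sqlite3

# Groupwise maximum: for each article, keep the subject area(s) whose
# co-author frequency equals that article's own maximum frequency.
QUERY = """
SELECT t2.name AS title, t2.theAbstract AS abstract, sub.name AS subjectArea
FROM (
    -- each article's maximum co-author subject frequency
    SELECT t.id AS id, MAX(t.freq) AS freq
    FROM (
        -- the subject frequencies of each article
        SELECT art.id AS id, COUNT(*) AS freq
        FROM authors_articles aa
        JOIN authors_subjectareas asub USING (author)
        JOIN articles art ON art.id = aa.article
        GROUP BY art.id, asub.subjectArea
    ) t
    GROUP BY t.id
) t1 NATURAL JOIN (
    -- the information we actually want; joined on the shared (id, freq)
    SELECT art.id AS id, art.name AS name, art.theAbstract AS theAbstract,
           asub.subjectArea AS subjectArea, COUNT(*) AS freq
    FROM authors_articles aa
    JOIN authors_subjectareas asub USING (author)
    JOIN articles art ON art.id = aa.article
    GROUP BY art.id, asub.subjectArea
) t2 JOIN subjectareas sub ON sub.id = t2.subjectArea
"""


def build_demo_db():
    """Create an in-memory database with made-up sample rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT,
                               theAbstract TEXT, year INTEGER);
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE subjectareas (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE authors_articles (author INTEGER, article INTEGER);
        CREATE TABLE authors_subjectareas (author INTEGER, subjectArea INTEGER);

        INSERT INTO articles VALUES
            (1, 'Paper A', 'Abstract A', 2001),
            (2, 'Paper B', 'Abstract B', 2003);
        INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Di');
        INSERT INTO subjectareas VALUES
            (1, 'Databases'), (2, 'Networks'), (3, 'Graphics');
        -- Paper A is written by authors 1 and 2, Paper B by authors 3 and 4
        INSERT INTO authors_articles VALUES (1, 1), (2, 1), (3, 2), (4, 2);
        -- author 1: Databases, Networks; author 2: Databases;
        -- author 3: Graphics; author 4: Graphics, Networks
        INSERT INTO authors_subjectareas VALUES
            (1, 1), (1, 2), (2, 1), (3, 3), (4, 3), (4, 2);
    """)
    return conn


def top_subject_per_article(conn):
    """Return (title, abstract, subjectArea) rows, sorted for stable output."""
    return sorted(conn.execute(QUERY).fetchall())


if __name__ == "__main__":
    for row in top_subject_per_article(build_demo_db()):
        print(row)
```

With this sample data, Paper A comes out as Databases (two of its co-author entries) and Paper B as Graphics. Note that if two subject areas tie for the highest frequency within one article, the query returns one row per tied subject area; it also does not apply the year filter from the question, which would go in a WHERE clause of the inner queries.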