I have a dataset that lists every author associated with each book, as shown below:
| Book ID | Author |
|---------|--------|
| 1 | A |
| 1 | B |
| 1 | C |
| 2 | A |
| 2 | X |
| 3 | P |
| 3 | C |
| 4 | Q |
| 4 | B |
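I want to find out which author is the top collaborator, meaning the one who has collaborated with the largest number of different authors. For the data above:
A -> B, C, X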
B -> A, C, Q
C -> A, B, P
P -> C
X -> A
Q -> B
I tried several GROUP BY combinations, but I could not get the expected result.
Answer 0 (score: 1)
One approach is to use a self-join to get each author's collaborators. You can then use count(distinct) to count them:
select ba.author,
       count(distinct ba2.author) as num_collaborators,
       group_concat(distinct ba2.author) as collaborators
from book_authors ba join
     book_authors ba2
     -- pair each author with every other author of the same book
     on ba.book_id = ba2.book_id and ba.author <> ba2.author
group by ba.author
order by num_collaborators desc;
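
To sanity-check the self-join against the sample data from the question, here is a minimal sketch that runs the query through Python's built-in sqlite3 module. It assumes SQLite as a stand-in for the MySQL-style database the answer seems to target (both provide group_concat). The helper name collaborator_counts and the extra tie-breaker in the order by are illustrative additions, not part of the original answer.

```python
import sqlite3

# Sample rows from the question: (book_id, author)
ROWS = [
    (1, "A"), (1, "B"), (1, "C"),
    (2, "A"), (2, "X"),
    (3, "P"), (3, "C"),
    (4, "Q"), (4, "B"),
]

# Same self-join as in the answer; the secondary sort on author is an
# assumption added here so that ties come back in a stable order.
QUERY = """
select ba.author,
       count(distinct ba2.author) as num_collaborators,
       group_concat(distinct ba2.author) as collaborators
from book_authors ba
join book_authors ba2
  on ba.book_id = ba2.book_id and ba.author <> ba2.author
group by ba.author
order by num_collaborators desc, ba.author;
"""


def collaborator_counts(rows):
    """Return {author: (count, sorted list of collaborators)} (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table book_authors (book_id integer, author text)")
        conn.executemany("insert into book_authors values (?, ?)", rows)
        result = {}
        for author, count, names in conn.execute(QUERY):
            # group_concat does not guarantee an order, so sort the names here
            result[author] = (count, sorted(names.split(",")))
        return result
    finally:
        conn.close()


if __name__ == "__main__":
    for author, (count, names) in collaborator_counts(ROWS).items():
        print(author, count, ", ".join(names))
```

With the sample data, A, B, and C each have three distinct collaborators, so there is a three-way tie for top collaborator. If only the top rows are wanted, a limit clause can be added to the query.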