Given the following database schema:
movies_directors(director_id:int,movie_id:int)
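roles(actor_id:int,movie_id:int,roles:string)
How do I find, for each director, the actor they have worked with on the most movies? The output columns should be:
director_id, actor_id
The result I get from the query I tried is: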
DIRECTOR_ID  ACTOR_ID  COLLABORATIONS
        101         1               2
        102         6               1
        105         4               1
        101         2               1
        104         8               1
        101         3               1
        103         7               1
        100        11               1
        101        10               1
        100         5               1
        104         2               1
        101         9               1
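I only need, for each director, the combination with the actor they have made the most movies with. For example, for director_id 101, only the entry 101 1 2 should be shown.
Answer 0 (score: 2)
Perhaps you should use TDQD: Test-Driven Query Design.
The first step is to determine which directors worked with which actors in which movies (call this Q1). That is found by joining the two tables on the Movie_ID column: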
    SELECT d.Movie_ID, d.Director_ID, a.Actor_ID
      FROM Movies_Directors AS d
      JOIN Roles AS a
        ON d.Movie_ID = a.Movie_ID
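In the TDQD spirit, it can help to run each step against a tiny data set before building on it. Here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The sample rows are hypothetical (they are not taken from the question), and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INT, movie_id INT);
    CREATE TABLE roles (actor_id INT, movie_id INT, roles TEXT);
    -- Hypothetical sample rows, not taken from the question.
    INSERT INTO movies_directors VALUES (101, 1), (101, 2), (102, 3);
    INSERT INTO roles VALUES (1, 1, 'lead'), (1, 2, 'lead'),
                             (2, 1, 'support'), (6, 3, 'lead');
""")

# Q1: one row per (movie, director, actor) that share a movie.
q1 = """
    SELECT d.movie_id, d.director_id, a.actor_id
      FROM movies_directors AS d
      JOIN roles AS a
        ON d.movie_id = a.movie_id
     ORDER BY d.movie_id, a.actor_id
"""
q1_rows = conn.execute(q1).fetchall()
for row in q1_rows:
    print(row)
```

Each printed tuple is (movie_id, director_id, actor_id), so you can check the join by eye before adding any aggregation.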
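Since you have not told us the primary keys of the tables, we cannot tell whether a single actor can be recorded as playing several different roles in one movie. I will assume that the primary key of the Roles table is the combination (Movie_ID, Actor_ID).
We need to count the rows for each actor and director combination in the query above (call this Q2):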
    SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS num_joint_movies
      FROM Movies_Directors AS d
      JOIN Roles AS a
        ON d.Movie_ID = a.Movie_ID
     GROUP BY d.Director_ID, a.Actor_ID
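The primary-key assumption matters for this count. If Roles can hold several rows for the same actor in the same movie, COUNT(*) counts roles rather than movies. The sketch below (Python's sqlite3, hypothetical rows in which actor 1 plays two roles in movie 1) compares COUNT(*) with COUNT(DISTINCT movie_id), which is one possible guard if the assumption does not hold.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INT, movie_id INT);
    CREATE TABLE roles (actor_id INT, movie_id INT, roles TEXT);
    -- Hypothetical rows: actor 1 plays two roles in movie 1,
    -- so (movie_id, actor_id) is NOT unique here.
    INSERT INTO movies_directors VALUES (101, 1);
    INSERT INTO roles VALUES (1, 1, 'hero'), (1, 1, 'narrator');
""")

sql = """
    SELECT d.director_id, a.actor_id,
           COUNT(*)                   AS num_rows,
           COUNT(DISTINCT d.movie_id) AS num_joint_movies
      FROM movies_directors AS d
      JOIN roles AS a
        ON d.movie_id = a.movie_id
     GROUP BY d.director_id, a.actor_id
"""
counts = conn.execute(sql).fetchall()
print(counts)  # num_rows is 2, num_joint_movies is 1
```

If (Movie_ID, Actor_ID) really is the primary key, the two counts always agree and COUNT(*) is fine.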
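Q3: For each director, what is the maximum number of times they worked with any one actor?
We now need to find, from the result above, the maximum number of joint movies for each director. This requires treating the query above as a table (a derived table in the FROM clause), like this: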
    SELECT n.Director_ID, MAX(n.num_joint_movies) AS max_joint_movies
      FROM (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS num_joint_movies
              FROM Movies_Directors AS d
              JOIN Roles AS a
                ON d.Movie_ID = a.Movie_ID
             GROUP BY d.Director_ID, a.Actor_ID
           ) AS n
     GROUP BY n.Director_ID
Now we need to combine queries Q2 and Q3 to get the actors:
    SELECT q3.Director_ID, q2.Actor_ID
      FROM (SELECT n.Director_ID, MAX(n.Num_Joint_Movies) AS Max_Joint_Movies
              FROM (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
                      FROM Movies_Directors AS d
                      JOIN Roles AS a
                        ON d.Movie_ID = a.Movie_ID
                     GROUP BY d.Director_ID, a.Actor_ID
                   ) AS n
             GROUP BY n.Director_ID
           ) AS q3
      JOIN (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
              FROM Movies_Directors AS d
              JOIN Roles AS a
                ON d.Movie_ID = a.Movie_ID
             GROUP BY d.Director_ID, a.Actor_ID
           ) AS q2
        ON q3.Director_ID = q2.Director_ID AND q3.Max_Joint_Movies = q2.Num_Joint_Movies
The SQL standard allows queries like this to be simplified with common table expressions, which are introduced by a WITH clause:
    WITH cte AS
         (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
            FROM Movies_Directors AS d
            JOIN Roles AS a
              ON d.Movie_ID = a.Movie_ID
           GROUP BY d.Director_ID, a.Actor_ID
         )
    SELECT q3.Director_ID, cte.Actor_ID
      FROM (SELECT cte.Director_ID, MAX(cte.Num_Joint_Movies) AS Max_Joint_Movies
              FROM cte
             GROUP BY cte.Director_ID
           ) AS q3
      JOIN cte
        ON q3.Director_ID = cte.Director_ID AND q3.Max_Joint_Movies = cte.Num_Joint_Movies
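One thing worth checking with a test is what happens on ties: a director whose top count is shared by several actors gets one row per tied actor. Here is a minimal sketch that runs the WITH version in an in-memory SQLite database from Python. The sample rows are hypothetical, and the Num_Joint_Movies output column and the ORDER BY are added here only to make the result easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INT, movie_id INT);
    CREATE TABLE roles (actor_id INT, movie_id INT, roles TEXT);
    -- Hypothetical rows: director 101 has a clear favourite (actor 1);
    -- director 102 worked once each with actors 6 and 7 (a tie).
    INSERT INTO movies_directors VALUES (101, 1), (101, 2), (102, 3);
    INSERT INTO roles VALUES (1, 1, 'lead'), (1, 2, 'lead'), (2, 1, 'support'),
                             (6, 3, 'lead'), (7, 3, 'support');
""")

sql = """
    WITH cte AS
         (SELECT d.director_id, a.actor_id, COUNT(*) AS num_joint_movies
            FROM movies_directors AS d
            JOIN roles AS a
              ON d.movie_id = a.movie_id
           GROUP BY d.director_id, a.actor_id
         )
    SELECT q3.director_id, cte.actor_id, cte.num_joint_movies
      FROM (SELECT director_id, MAX(num_joint_movies) AS max_joint_movies
              FROM cte
             GROUP BY director_id
           ) AS q3
      JOIN cte
        ON q3.director_id = cte.director_id
       AND q3.max_joint_movies = cte.num_joint_movies
     ORDER BY q3.director_id, cte.actor_id
"""
per_director = conn.execute(sql).fetchall()
for row in per_director:
    print(row)
```

If you need exactly one row per director even when there is a tie, you would have to add a tie-breaking rule (for example, the lowest Actor_ID); the question does not say which one is wanted.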
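Since the question has changed since I began answering, the result shown above may not be what is wanted, although the revised question does not make clear what is. However, the general technique of breaking the problem down into answerable sub-queries is valuable; it is how I approach any query like this. If you are looking for the single Director + Actor combination with the most collaborations, then we need to modify Q3 to find the maximum number of joint movies across all directors (call this Q6):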
    SELECT MAX(n.num_joint_movies) AS max_joint_movies
      FROM (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS num_joint_movies
              FROM Movies_Directors AS d
              JOIN Roles AS a
                ON d.Movie_ID = a.Movie_ID
             GROUP BY d.Director_ID, a.Actor_ID
           ) AS n
We now need to combine Q6 with Q2 again:
    SELECT q2.Director_ID, q2.Actor_ID
      FROM (SELECT MAX(n.Num_Joint_Movies) AS Max_Joint_Movies
              FROM (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
                      FROM Movies_Directors AS d
                      JOIN Roles AS a
                        ON d.Movie_ID = a.Movie_ID
                     GROUP BY d.Director_ID, a.Actor_ID
                   ) AS n
           ) AS q3
      JOIN (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
              FROM Movies_Directors AS d
              JOIN Roles AS a
                ON d.Movie_ID = a.Movie_ID
             GROUP BY d.Director_ID, a.Actor_ID
           ) AS q2
        ON q3.Max_Joint_Movies = q2.Num_Joint_Movies
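The global-maximum query also gets shorter with a CTE. The sketch below is a variant, not the query above: it compares each pair's count against a scalar subquery instead of joining to a one-row derived table. It runs in an in-memory SQLite database from Python, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INT, movie_id INT);
    CREATE TABLE roles (actor_id INT, movie_id INT, roles TEXT);
    -- Hypothetical rows: only the pair (101, 1) shares two movies.
    INSERT INTO movies_directors VALUES (101, 1), (101, 2), (102, 3);
    INSERT INTO roles VALUES (1, 1, 'lead'), (1, 2, 'lead'),
                             (2, 1, 'support'), (6, 3, 'lead');
""")

sql = """
    WITH cte AS
         (SELECT d.director_id, a.actor_id, COUNT(*) AS num_joint_movies
            FROM movies_directors AS d
            JOIN roles AS a
              ON d.movie_id = a.movie_id
           GROUP BY d.director_id, a.actor_id
         )
    SELECT director_id, actor_id
      FROM cte
     WHERE num_joint_movies = (SELECT MAX(num_joint_movies) FROM cte)
     ORDER BY director_id, actor_id
"""
top_pairs = conn.execute(sql).fetchall()
print(top_pairs)
```

As with the per-director version, several pairs are returned if they tie for the overall maximum.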
Answer 1 (score: 2)
You have to use the max() function.
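    -- collaborations is not a column of either table, so count it per (director, actor) pair first
    SELECT c.director_id, MAX(c.collaborations) AS max_collaborations
      FROM (SELECT mov.director_id, r.actor_id, COUNT(*) AS collaborations
              FROM movies_directors mov
             INNER JOIN roles r
                ON r.movie_id = mov.movie_id
             GROUP BY mov.director_id, r.actor_id
           ) c
     GROUP BY c.director_id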
Answer 2 (score: 0)
Here is a concise way to do what you want:
    -- SQL Server syntax: TOP (1) WITH TIES keeps every row that ties for rank 1
    select top (1) with ties
        M.director_id,
        R.actor_id,
        COUNT(M.movie_id) as MovieCount
    from movies_directors as M
    join roles as R
        on M.movie_id = R.movie_id
    group by
        M.director_id,
        R.actor_id
    order by rank() over (
        partition by M.director_id
        order by COUNT(M.movie_id) desc
    )
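TOP (1) WITH TIES is specific to SQL Server. On other databases that support window functions, roughly the same idea can be written by computing RANK() in a CTE and filtering on it. The sketch below is one such rewrite, run from Python against an in-memory SQLite database; it assumes SQLite 3.25 or later (the first release with window functions), and the sample rows and the names counts, ranked and rnk are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INT, movie_id INT);
    CREATE TABLE roles (actor_id INT, movie_id INT, roles TEXT);
    -- Hypothetical rows; director 102 has a tie between actors 6 and 7.
    INSERT INTO movies_directors VALUES (101, 1), (101, 2), (102, 3);
    INSERT INTO roles VALUES (1, 1, 'lead'), (1, 2, 'lead'), (2, 1, 'support'),
                             (6, 3, 'lead'), (7, 3, 'support');
""")

# Requires window-function support (SQLite >= 3.25).
sql = """
    WITH counts AS
         (SELECT m.director_id, r.actor_id, COUNT(*) AS movie_count
            FROM movies_directors AS m
            JOIN roles AS r
              ON m.movie_id = r.movie_id
           GROUP BY m.director_id, r.actor_id
         ),
         ranked AS
         (SELECT director_id, actor_id, movie_count,
                 RANK() OVER (PARTITION BY director_id
                              ORDER BY movie_count DESC) AS rnk
            FROM counts
         )
    SELECT director_id, actor_id, movie_count
      FROM ranked
     WHERE rnk = 1
     ORDER BY director_id, actor_id
"""
ranked_rows = conn.execute(sql).fetchall()
for row in ranked_rows:
    print(row)
```

RANK() gives tied actors the same rank, so ties are kept, just as WITH TIES does; swapping in ROW_NUMBER() would pick a single arbitrary row per director instead.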