Query to find the maximum relationship between two fields

Time: 2013-02-01 03:06:16

Tags: sql sql-server

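Given the following database structure:

movies_directors(director_id:int, movie_id:int)
roles(actor_id:int, movie_id:int, roles:string)

How can I find, for each director, the actor they have worked with on the most movies? The output should have the columns:

director_id, actor_id

The result of the query I have tried so far is: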

DIRECTOR_ID  ACTOR_ID  COLLABORATIONS
-----------  --------  --------------
        101         1               2
        102         6               1
        105         4               1
        101         2               1
        104         8               1
        101         3               1
        103         7               1
        100        11               1
        101        10               1
        100         5               1
        104         2               1
        101         9               1

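I only need, for each director, the combination with the actor they have made the most movies with. For example, for director_id 101, only the entry 101 1 2 should be displayed.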

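3 Answers:

Answer 0 (score: 2)

Perhaps you should use TDQD (Test-Driven Query Design).

Q1: Which directors have worked with which actors

The first step is to determine which directors have worked with which actors on which movies. This is found by joining the two tables on the Movie_ID column: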

SELECT d.Movie_ID, d.Director_ID, A.Actor_ID
  FROM Movies_Directors AS d
  JOIN Roles AS a
    ON d.Movie_ID = a.Movie_ID

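Since you have not told us the primary keys of the tables, we cannot tell whether a single actor can be recorded as playing several different roles in one movie. I will assume that the primary key of the Roles table is the combination (Movie_ID, Actor_ID).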
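If that assumption does not hold, COUNT(*) counts role rows rather than movies. The sketch below shows the difference and one possible remedy, COUNT(DISTINCT ...). It uses Python's sqlite3 module as a stand-in for SQL Server, and the sample rows and the joint_movies helper are hypothetical, not taken from the question.

```python
import sqlite3

# Hypothetical sample rows: actor 1 plays two roles in movie 10.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INTEGER, movie_id INTEGER);
    CREATE TABLE roles (actor_id INTEGER, movie_id INTEGER, roles TEXT);
    INSERT INTO movies_directors VALUES (101, 10), (101, 11);
    INSERT INTO roles VALUES
        (1, 10, 'hero'), (1, 10, 'narrator'),  -- two roles, one movie
        (1, 11, 'villain');
""")


def joint_movies(count_expr):
    """Return {(director_id, actor_id): count} for the given COUNT expression."""
    sql = f"""
        SELECT d.director_id, a.actor_id, {count_expr} AS num_joint_movies
          FROM movies_directors AS d
          JOIN roles AS a
            ON d.movie_id = a.movie_id
         GROUP BY d.director_id, a.actor_id
    """
    return {(d, a): n for d, a, n in conn.execute(sql)}


rows_counted = joint_movies("COUNT(*)")                      # counts role rows
movies_counted = joint_movies("COUNT(DISTINCT d.movie_id)")  # counts movies
print(rows_counted)
print(movies_counted)
```

With a primary key on (Movie_ID, Actor_ID) the two expressions give the same answer, so COUNT(*) is safe under the stated assumption.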

Q2: How many times each director has worked with each actor

We need to count the rows from the query above for each combination of actor and director:

SELECT d.Director_ID, A.Actor_ID, COUNT(*) AS num_joint_movies
  FROM Movies_Directors AS d
  JOIN Roles AS a
    ON d.Movie_ID = a.Movie_ID
 GROUP BY d.Director_ID, A.Actor_ID

Q3: For each director, the maximum number of times they have worked with any one actor

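We now need to find the maximum number of joint movies for each director from the result above. This requires treating the query above as a table, like this: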

SELECT n.Director_ID, MAX(n.num_joint_movies) AS max_joint_movies
  FROM (SELECT d.Director_ID, A.Actor_ID, COUNT(*) AS num_joint_movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, A.Actor_ID
       ) AS n
 GROUP BY n.Director_ID
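
In the spirit of testing each step, the Q2 and Q3 queries can be run against a small data set and their results checked before moving on. This is a minimal sketch using Python's sqlite3 module rather than SQL Server. The movie IDs and role names are hypothetical, chosen so that the counts for directors 101 and 104 match a subset of the rows reported in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INTEGER, movie_id INTEGER);
    CREATE TABLE roles (actor_id INTEGER, movie_id INTEGER, roles TEXT);
    -- Hypothetical rows: director 101 made movies 1 and 2, director 104 made 3.
    INSERT INTO movies_directors VALUES (101, 1), (101, 2), (104, 3);
    INSERT INTO roles VALUES
        (1, 1, 'a'), (1, 2, 'b'), (2, 1, 'c'), (8, 3, 'd'), (2, 3, 'e');
""")

# Q2: joint movie count per (director, actor) pair.
Q2 = """
    SELECT d.director_id, a.actor_id, COUNT(*) AS num_joint_movies
      FROM movies_directors AS d
      JOIN roles AS a
        ON d.movie_id = a.movie_id
     GROUP BY d.director_id, a.actor_id
"""

# Q3: per-director maximum, built on Q2 used as a derived table.
Q3 = f"""
    SELECT n.director_id, MAX(n.num_joint_movies) AS max_joint_movies
      FROM ({Q2}) AS n
     GROUP BY n.director_id
"""

q2_result = sorted(conn.execute(Q2).fetchall())
q3_result = sorted(conn.execute(Q3).fetchall())
print(q2_result)
print(q3_result)
```

Checking the intermediate results this way makes it easier to see which step is wrong if the final query misbehaves.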

Q4: Which actor has worked most with each director

Now we need to combine queries Q2 and Q3 to get the actor:

SELECT q3.Director_ID, q2.Actor_ID
  FROM (SELECT n.Director_ID, MAX(n.Num_Joint_Movies) AS Max_Joint_Movies
          FROM (SELECT d.Director_ID, A.Actor_ID, COUNT(*) AS Num_Joint_Movies
                  FROM Movies_Directors AS d
                  JOIN Roles AS a
                    ON d.Movie_ID = a.Movie_ID
                 GROUP BY d.Director_ID, A.Actor_ID
               ) AS n
         GROUP BY n.Director_ID
       ) AS q3
  JOIN (SELECT d.Director_ID, A.Actor_ID, COUNT(*) AS Num_Joint_Movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, A.Actor_ID
        ) AS q2
     ON q3.Director_ID = q2.Director_ID AND q3.Max_Joint_Movies = q2.Num_Joint_Movies

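Q5: Using a common table expression (CTE)

The SQL standard allows common table expressions, introduced by a WITH clause, to simplify queries like this one: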

WITH cte AS
    (SELECT d.Director_ID, A.Actor_ID, COUNT(*) AS Num_Joint_Movies
       FROM Movies_Directors AS d
       JOIN Roles AS a
         ON d.Movie_ID = a.Movie_ID
      GROUP BY d.Director_ID, A.Actor_ID
    )
SELECT q3.Director_ID, cte.Actor_ID
  FROM (SELECT cte.Director_ID, MAX(cte.Num_Joint_Movies) AS Max_Joint_Movies
          FROM cte
         GROUP BY cte.Director_ID
       ) AS q3
  JOIN cte
    ON q3.Director_ID = cte.Director_ID AND q3.Max_Joint_Movies = cte.Num_Joint_Movies
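
One consequence of joining on the maximum is worth seeing on data: when two actors tie for a director's maximum, both rows come back. The sketch below runs the CTE query through Python's sqlite3 module (a stand-in for SQL Server). The sample rows are hypothetical: director 101 has a clear winner, while director 104 has two actors with one movie each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INTEGER, movie_id INTEGER);
    CREATE TABLE roles (actor_id INTEGER, movie_id INTEGER, roles TEXT);
    -- Hypothetical rows, not the asker's real data.
    INSERT INTO movies_directors VALUES (101, 1), (101, 2), (104, 3);
    INSERT INTO roles VALUES
        (1, 1, 'a'), (1, 2, 'b'), (2, 1, 'c'), (8, 3, 'd'), (2, 3, 'e');
""")

CTE_QUERY = """
    WITH cte AS
        (SELECT d.director_id, a.actor_id, COUNT(*) AS num_joint_movies
           FROM movies_directors AS d
           JOIN roles AS a
             ON d.movie_id = a.movie_id
          GROUP BY d.director_id, a.actor_id
        )
    SELECT q3.director_id, cte.actor_id
      FROM (SELECT cte.director_id,
                   MAX(cte.num_joint_movies) AS max_joint_movies
              FROM cte
             GROUP BY cte.director_id
           ) AS q3
      JOIN cte
        ON q3.director_id = cte.director_id
       AND q3.max_joint_movies = cte.num_joint_movies
"""

top_pairs = sorted(conn.execute(CTE_QUERY).fetchall())
print(top_pairs)  # director 104 appears twice because of the tie
```

If exactly one row per director is required, a tie-breaking rule has to be chosen; the question does not say which one would be appropriate.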

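Q6: The maximum number of collaborations for any director

Since the question has changed since I started answering, the result shown above may not be what is wanted, although the revised question does not say what is wanted. However, the general technique of breaking the problem into answerable sub-queries is valuable; it is how I approach any similar query. If you are looking for the single Director + Actor combination with the most collaborations, then Q3 needs to be modified to find the maximum number of joint movies across all directors: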

SELECT MAX(n.num_joint_movies) AS max_joint_movies
  FROM (SELECT d.Director_ID, A.Actor_ID, COUNT(*) AS num_joint_movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, A.Actor_ID
       ) AS n

Q7: The actor and director who have collaborated most often

We now need to combine Q6 with Q2 again:

SELECT q2.Director_ID, q2.Actor_ID
  FROM (SELECT MAX(n.Num_Joint_Movies) AS Max_Joint_Movies
          FROM (SELECT d.Director_ID, A.Actor_ID, COUNT(*) AS Num_Joint_Movies
                  FROM Movies_Directors AS d
                  JOIN Roles AS a
                    ON d.Movie_ID = a.Movie_ID
                 GROUP BY d.Director_ID, A.Actor_ID
               ) AS n
       ) AS q3
  JOIN (SELECT d.Director_ID, A.Actor_ID, COUNT(*) AS Num_Joint_Movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, A.Actor_ID
        ) AS q2
     ON q3.Max_Joint_Movies = q2.Num_Joint_Movies

Answer 1 (score: 2)

You must use the max() function.

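SELECT x.director_id,
       MAX(x.collaborations) AS max_collaborations
FROM (SELECT mov.director_id, r.actor_id,
             COUNT(*) AS collaborations   -- movies shared by this pair
      FROM movies_directors mov
      INNER JOIN roles r
           ON r.movie_id = mov.movie_id
      GROUP BY mov.director_id, r.actor_id) AS x
GROUP BY x.director_id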

Answer 2 (score: 0)

Here is a concise way to do what you want:

  select top (1) with ties
    M.director_id,
    R.actor_id,
    COUNT(M.movie_id) as MovieCount
  from movies_directors as M
  join roles as R
  on M.movie_id = R.movie_id
  group by
    M.director_id,
    R.actor_id
  -- rank 1 within each director sorts first; WITH TIES keeps every rank-1 row
  order by rank() over (
    partition by M.director_id
    order by COUNT(M.movie_id) desc
  )
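
TOP (1) WITH TIES is specific to SQL Server. The same ranking idea can be sketched more portably by computing RANK() in a derived step and filtering on rank 1. The example below runs that variant through Python's sqlite3 module, which needs SQLite 3.25 or later for window functions. The sample rows and the names counts, ranked and rnk are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_directors (director_id INTEGER, movie_id INTEGER);
    CREATE TABLE roles (actor_id INTEGER, movie_id INTEGER, roles TEXT);
    -- Hypothetical rows, not the asker's real data.
    INSERT INTO movies_directors VALUES (101, 1), (101, 2), (104, 3);
    INSERT INTO roles VALUES
        (1, 1, 'a'), (1, 2, 'b'), (2, 1, 'c'), (8, 3, 'd'), (2, 3, 'e');
""")

RANK_QUERY = """
    WITH counts AS
        (SELECT M.director_id, R.actor_id, COUNT(M.movie_id) AS movie_count
           FROM movies_directors AS M
           JOIN roles AS R
             ON M.movie_id = R.movie_id
          GROUP BY M.director_id, R.actor_id
        ),
    ranked AS
        (SELECT director_id, actor_id, movie_count,
                RANK() OVER (PARTITION BY director_id
                             ORDER BY movie_count DESC) AS rnk
           FROM counts
        )
    SELECT director_id, actor_id, movie_count
      FROM ranked
     WHERE rnk = 1  -- RANK() gives tied rows the same rank, so ties are kept
"""

ranked_top = sorted(conn.execute(RANK_QUERY).fetchall())
print(ranked_top)
```

Replacing RANK() with ROW_NUMBER() would return exactly one row per director, at the cost of picking an arbitrary winner among tied actors unless the ORDER BY is extended.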