For each movie genre, I want to find the N actors who have played in the most movies of that genre.
Tables and their columns:
actor(actor_id,name)
role(actor_id,movie_id)
movie(movie_id,title)
movie_has_genre(movie_id,genre_id)
genre(genre_id,genre_name)
With this query I can find, for each genre, the actors who have played in the most movies of that genre.
select t1.genre_name, t1.actor_id, t1.max_value
from
(
select g.genre_name, a.actor_id, count(*) as max_value
from genre g
inner join movie_has_genre mhg on mhg.genre_id = g.genre_id
inner join movie m on mhg.movie_id = m.movie_id
inner join role r on m.movie_id = r.movie_id
inner join actor a on a.actor_id = r.actor_id
group by g.genre_name, a.actor_id
) t1
inner join
(
select genre_name, MAX(max_value) AS max_value
from
(
select g.genre_name, a.actor_id, count(*) as max_value
from genre g
inner join movie_has_genre mhg on mhg.genre_id = g.genre_id
inner join movie m on mhg.movie_id = m.movie_id
inner join role r on m.movie_id = r.movie_id
inner join actor a on a.actor_id = r.actor_id
group by g.genre_name, a.actor_id
) t
GROUP BY genre_name
) t2
ON t1.genre_name = t2.genre_name and t1.max_value = t2.max_value
ORDER BY
t1.max_value desc;
But I want to limit the number of actors to 1 per genre. How can I do that?
Example:
The result I get:
genre_name | actor_id | max_value
==================================
Thriller | 22591 | 7
Drama | 22591 | 6
Crime | 65536 | 3
Horror | 22591 | 3
Action | 292028 | 3
Action | 378578 | 3
Action | 388698 | 3
The result I want:
genre_name | actor_id | max_value
==================================
Thriller | 22591 | 7
Drama | 22591 | 6
Crime | 65536 | 3
Horror | 22591 | 3
Action | 292028 | 3
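Answer 0 (score: 0)
If you just want one actor per genre and do not care which of the tied actors you get, add a USING clause and a GROUP BY to your query: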
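-- min() keeps one of the tied actors; a bare actor_id is rejected when ONLY_FULL_GROUP_BY is enabled
select genre_name, min(actor_id) as actor_id, max_value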
from
(
select g.genre_name, a.actor_id, count(*) as max_value
from genre g
inner join movie_has_genre mhg on mhg.genre_id = g.genre_id
inner join movie m on mhg.movie_id = m.movie_id
inner join role r on m.movie_id = r.movie_id
inner join actor a on a.actor_id = r.actor_id
group by g.genre_name, a.actor_id
) t1
inner join
(
select genre_name, MAX(max_value) AS max_value
from
(
select g.genre_name, a.actor_id, count(*) as max_value
from genre g
inner join movie_has_genre mhg on mhg.genre_id = g.genre_id
inner join movie m on mhg.movie_id = m.movie_id
inner join role r on m.movie_id = r.movie_id
inner join actor a on a.actor_id = r.actor_id
group by g.genre_name, a.actor_id
) t
GROUP BY genre_name
) t2
USING(genre_name,max_value)
GROUP BY genre_name, max_value
ORDER BY max_value desc;
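Answer 1 (score: 0)
Some of the joins you are using are redundant. The query only needs the ids stored in movie_has_genre and role, so (assuming referential integrity) the movie and actor tables can be dropped.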
SELECT
U.genre_name, U.actor_id, U.actor_genre_count
FROM
(SELECT
A.genre_id, A.genre_name, C.actor_id, count(*) actor_genre_count
FROM genre A
JOIN movie_has_genre B
ON A.genre_id=B.genre_id
JOIN role C
ON C.movie_id=B.movie_id
GROUP BY A.genre_id, A.genre_name, C.actor_id) U
JOIN
(SELECT
S.genre_id, S.genre_name, MAX(S.actor_genre_count) max_actor_genre
FROM
(SELECT
A.genre_id, A.genre_name, C.actor_id, count(*) actor_genre_count
FROM genre A
JOIN movie_has_genre B
ON A.genre_id=B.genre_id
JOIN role C
ON C.movie_id=B.movie_id
GROUP BY A.genre_id, A.genre_name, C.actor_id) S
GROUP BY S.genre_id, S.genre_name) V
ON U.genre_name=V.genre_name AND U.actor_genre_count=V.max_actor_genre;
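Answer 2 (score: 0)
This solution is adapted from this Stack Overflow answer about limiting the results per name. I tried to write a similar query that should pick the first actor_id in each genre and return only that one.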
SELECT t1.genre_name, t1.actor_id, t1.actor_movie_count
FROM
(
SELECT g.genre_name, r.actor_id, COUNT(*) AS actor_movie_count
FROM genre g
INNER JOIN movie_has_genre mhg ON mhg.genre_id = g.genre_id
INNER JOIN role r ON mhg.movie_id = r.movie_id
GROUP BY g.genre_name, r.actor_id
) t1
LEFT JOIN
(
SELECT g.genre_name, r.actor_id, COUNT(*) AS actor_movie_count
FROM genre g
INNER JOIN movie_has_genre mhg ON mhg.genre_id = g.genre_id
INNER JOIN role r ON mhg.movie_id = r.movie_id
GROUP BY g.genre_name, r.actor_id
) t2
-- t2 matches every actor of the same genre that beats t1:
-- a higher count, or the same count with a smaller actor_id
ON t1.genre_name = t2.genre_name
AND (t2.actor_movie_count > t1.actor_movie_count
OR (t2.actor_movie_count = t1.actor_movie_count AND t2.actor_id < t1.actor_id))
-- keep only the rows that nothing beats, i.e. one actor per genre
WHERE t2.genre_name IS NULL
ORDER BY t1.actor_movie_count DESC;
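The anti-join idea behind this kind of query (keep a row only if no other row of the same group beats it) can be tried out on a small scale. The following is a minimal sketch, not the answer author's code: it uses Python's standard sqlite3 module, and the table actor_genre_count and the helper one_actor_per_genre are hypothetical names. The rows are copied from the "result I get" table in the question, so the per-actor counts are assumed to be already computed.

```python
import sqlite3

# Rows copied from the "result I get" table in the question.
ROWS = [
    ("Thriller", 22591, 7),
    ("Drama", 22591, 6),
    ("Crime", 65536, 3),
    ("Horror", 22591, 3),
    ("Action", 292028, 3),
    ("Action", 378578, 3),
    ("Action", 388698, 3),
]

# Anti-join: t2 matches any row of the same genre that beats t1
# (higher count, or equal count with a smaller actor_id).
# Rows of t1 without such a match are the per-genre winners.
ANTI_JOIN = """
select t1.genre_name, t1.actor_id, t1.cnt
from actor_genre_count t1
left join actor_genre_count t2
  on t2.genre_name = t1.genre_name
 and (t2.cnt > t1.cnt
      or (t2.cnt = t1.cnt and t2.actor_id < t1.actor_id))
where t2.genre_name is null
order by t1.cnt desc, t1.genre_name
"""


def one_actor_per_genre(rows):
    """Load the rows into an in-memory table and run the anti-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table actor_genre_count("
        "genre_name text, actor_id integer, cnt integer)"
    )
    conn.executemany("insert into actor_genre_count values (?, ?, ?)", rows)
    return conn.execute(ANTI_JOIN).fetchall()


for row in one_actor_per_genre(ROWS):
    print(row)
```

On these rows the three tied Action actors collapse to the one with the smallest actor_id (292028), which matches the result the question asks for.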
If this still does not solve your problem, here are other similar questions with explanations:
SO answer about returning 1 row per group
SO question about limiting query answer to N results per group
Answer 3 (score: 0)
You can use a correlated LIMIT 1 subquery to get the id of the actor who has played in the most movies of each genre.
select g.genre_name, (
select r.actor_id
from movie_has_genre mg
join role r on r.movie_id = mg.movie_id
where mg.genre_id = g.genre_id
group by r.actor_id
order by count(*) desc,
r.actor_id asc -- on tie least actor_id wins
limit 1
) as actor_id
from genre g
The result looks like this:
genre_name | actor_id
======================
Thriller | 22591
Drama | 22591
Crime | 65536
Horror | 22591
Action | 292028
As you can see, the count is not included. If you need the count, a simple way is to change the SELECT clause in the subquery to
select concat(r.actor_id, ':', count(*)) as actor_id_count
This returns the actor_id and the count in a single string column, like
genre_name | actor_id_count
===========================
Thriller | 22591:7
You can then parse it in your application code (using split, explode, or whatever your language offers).
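As a rough illustration of that parsing step, here is a minimal Python sketch. The row tuple and the helper name parse_actor_id_count are hypothetical; Python's str.split plays the role of split or explode in other languages.

```python
# Hypothetical row as the query above might return it:
# (genre_name, actor_id_count)
row = ("Thriller", "22591:7")


def parse_actor_id_count(value):
    """Split an 'actor_id:count' string into two integers."""
    actor_id, count = value.split(":")
    return int(actor_id), int(count)


genre_name = row[0]
actor_id, movie_count = parse_actor_id_count(row[1])
print(genre_name, actor_id, movie_count)
```

Because actor_id is numeric, the colon cannot appear inside it, so a plain split on ":" is enough here.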
A solution with a CTE (common table expression) and ROW_NUMBER() (a window function), both supported by MySQL 8 and MariaDB 10.2, could be:
with cte as (
select g.genre_name, r.actor_id, count(*) as max_value,
row_number() over (partition by g.genre_name order by count(*) desc, r.actor_id) as rn
from genre g
inner join movie_has_genre mhg on mhg.genre_id = g.genre_id
inner join role r on mhg.movie_id = r.movie_id
group by g.genre_name, r.actor_id
)
select genre_name, actor_id, max_value from cte where rn = 1
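The original question asks for N actors per genre. With the ROW_NUMBER() approach this should only require changing rn = 1 to rn <= N. The sketch below tries that with Python's standard sqlite3 module (window functions need SQLite 3.25 or newer). The toy data and the helpers build_toy_db and top_actors_per_genre are made up for illustration and are not part of the original schema's data.

```python
import sqlite3

# Same CTE as above, with "rn = 1" generalized to "rn <= N".
QUERY = """
with cte as (
    select g.genre_name, r.actor_id, count(*) as max_value,
           row_number() over (partition by g.genre_name
                              order by count(*) desc, r.actor_id) as rn
    from genre g
    inner join movie_has_genre mhg on mhg.genre_id = g.genre_id
    inner join role r on mhg.movie_id = r.movie_id
    group by g.genre_name, r.actor_id
)
select genre_name, actor_id, max_value
from cte
where rn <= ?
order by genre_name, rn
"""


def build_toy_db():
    """Create an in-memory database with a tiny, made-up dataset."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table genre(genre_id integer primary key, genre_name text);
        create table movie_has_genre(movie_id integer, genre_id integer);
        create table role(actor_id integer, movie_id integer);
        insert into genre values (1, 'Action'), (2, 'Drama');
        insert into movie_has_genre values (10, 1), (11, 1), (12, 2);
        insert into role values
            (100, 10), (100, 11), (200, 10), (300, 11), (200, 12);
    """)
    return conn


def top_actors_per_genre(conn, n):
    """Return at most n actors per genre, most movies first."""
    return conn.execute(QUERY, (n,)).fetchall()


conn = build_toy_db()
print(top_actors_per_genre(conn, 1))
print(top_actors_per_genre(conn, 2))
```

In the toy data, actors 200 and 300 are tied in Action with one movie each; the second sort key r.actor_id decides which of them gets rn = 2.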