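I've been trying to improve my SQL join skills. I'm using the classic DVD rental sample database (available here). I'm trying to determine each customer's favorite actor by counting how many times an actor appears across all the films that customer has rented.
Right now I have this monster of a query with 3 subqueries.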
SELECT email, actor.last_name, count(actor.last_name)
FROM (SELECT email, actor_id
FROM (SELECT email, film_id
FROM (SELECT email, inventory_id
FROM customer as cu
JOIN rental ON cu.customer_id = rental.customer_id
ORDER BY email) as sq
JOIN inventory ON sq.inventory_id = inventory.inventory_id) as sq2
JOIN film_actor ON sq2.film_id = film_actor.film_id) as sq3
JOIN actor ON sq3.actor_id = actor.actor_id
GROUP BY email, actor.last_name
ORDER BY COUNT(actor.last_name) DESC;
What I end up with is the full list of emails, actor last names, and total appearance counts, like this:
email actor.last_name count
"debra.nelson@sakilacustomer.org" "Nolte" "12"
"nathan.runyon@sakilacustomer.org" "Guiness" "11"
"margie.wade@sakilacustomer.org" "Temple" "11"
"marsha.douglas@sakilacustomer.org" "Kilmer" "11"
"veronica.stone@sakilacustomer.org" "Nolte" "11"
"wendy.harrison@sakilacustomer.org" "Willis" "10" etc
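How can I modify the query so that I get only the top actor for each email, and is there a way to make this query simpler while producing the same result?
Answer 0 (score: 1)
As far as simplifying this query goes, remember to use table aliases.
Your query is full of unnecessary subqueries; each one only feeds the next join, so it can be boiled down to a single chain of joins: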
SELECT cu.email, act.last_name, count(act.last_name)
FROM customer as cu
JOIN rental as ren ON cu.customer_id = ren.customer_id
JOIN inventory as inv ON ren.inventory_id = inv.inventory_id
JOIN film_actor as fil ON inv.film_id = fil.film_id
JOIN actor as act ON act.actor_id = fil.actor_id
group by cu.email,act.last_name
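Next, to get the top actor for each email address, we can apply the row_number() window function, partitioned by email and ordered by the count descending, and then filter in an outer query to the rows where the row number is 1: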
SELECT x.email, x.last_name, x.appearances
FROM (
    -- Count appearances per (email, actor) and rank them within each email.
    SELECT cu.email, act.last_name, COUNT(act.last_name) AS appearances,
           ROW_NUMBER() OVER (PARTITION BY cu.email ORDER BY COUNT(act.last_name) DESC) AS rn
    FROM customer AS cu
    JOIN rental AS ren ON cu.customer_id = ren.customer_id
    JOIN inventory AS inv ON ren.inventory_id = inv.inventory_id
    JOIN film_actor AS fil ON inv.film_id = fil.film_id
    JOIN actor AS act ON act.actor_id = fil.actor_id
    GROUP BY cu.email, act.last_name
) AS x
WHERE x.rn = 1  -- keep only the top-ranked actor for each email
ORDER BY x.appearances DESC;
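To see the row-number filter at work without installing the full sample database, here is a minimal sketch that runs the same idea against an in-memory SQLite database through Python's `sqlite3` module. Everything about the data is an assumption made for illustration: the tables are cut down to the columns the query touches, the rows and `example.com` addresses are invented, and the helper names `build_demo_db` and `top_actor_per_email` are hypothetical. The query is also split into two CTEs (count first, then rank) and given a `last_name` tiebreaker so the result is deterministic; neither change comes from the answer above. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical miniature of the DVD rental schema: only the columns the query uses.
SCHEMA = """
CREATE TABLE customer   (customer_id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE rental     (rental_id INTEGER PRIMARY KEY, customer_id INTEGER, inventory_id INTEGER);
CREATE TABLE inventory  (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
CREATE TABLE actor      (actor_id INTEGER PRIMARY KEY, last_name TEXT);
"""

# Step 1 (counts): appearances per (email, actor last name).
# Step 2 (ranked): number the rows within each email, highest count first.
# Step 3: keep only row number 1 for each email.
TOP_ACTOR_SQL = """
WITH counts AS (
    SELECT cu.email, act.last_name, COUNT(*) AS appearances
    FROM customer AS cu
    JOIN rental AS ren ON cu.customer_id = ren.customer_id
    JOIN inventory AS inv ON ren.inventory_id = inv.inventory_id
    JOIN film_actor AS fil ON inv.film_id = fil.film_id
    JOIN actor AS act ON act.actor_id = fil.actor_id
    GROUP BY cu.email, act.last_name
),
ranked AS (
    SELECT email, last_name, appearances,
           ROW_NUMBER() OVER (
               PARTITION BY email
               ORDER BY appearances DESC, last_name  -- last_name breaks ties (assumption)
           ) AS rn
    FROM counts
)
SELECT email, last_name, appearances
FROM ranked
WHERE rn = 1
ORDER BY appearances DESC, email;
"""


def build_demo_db():
    """Create an in-memory database filled with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO customer VALUES (?, ?)",
                     [(1, "a@example.com"), (2, "b@example.com")])
    conn.executemany("INSERT INTO actor VALUES (?, ?)",
                     [(1, "Nolte"), (2, "Temple"), (3, "Willis")])
    # (actor_id, film_id): film 10 has Nolte and Temple, 11 has Nolte, 12 has Temple and Willis.
    conn.executemany("INSERT INTO film_actor VALUES (?, ?)",
                     [(1, 10), (2, 10), (1, 11), (2, 12), (3, 12)])
    conn.executemany("INSERT INTO inventory VALUES (?, ?)",
                     [(100, 10), (101, 11), (102, 12)])
    # (rental_id, customer_id, inventory_id)
    conn.executemany("INSERT INTO rental VALUES (?, ?, ?)",
                     [(1, 1, 100), (2, 1, 101), (3, 2, 102), (4, 2, 100), (5, 2, 102)])
    return conn


def top_actor_per_email(conn):
    """Return one (email, last_name, appearances) tuple per customer email."""
    return conn.execute(TOP_ACTOR_SQL).fetchall()


if __name__ == "__main__":
    for row in top_actor_per_email(build_demo_db()):
        print(row)
```

Note that `ROW_NUMBER()` always keeps exactly one row per email, so when two actors tie for the highest count only one of them survives the filter. If all tied actors should be returned, `RANK()` can be swapped in for `ROW_NUMBER()` (and the tiebreaker dropped), because tied rows then share rank 1.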