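I'm trying to fix a query that breaks down the average movie rating for several genres, grouped by the number of kids in households subscribed to a movie streaming plan.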
SELECT DISTINCT c.numbkids "Number of Kids",
(SELECT ROUND(AVG(r.rating),2) FROM netflix.ratings100 r
JOIN netflix.movies_genres g ON r.movieid = g.movieid
JOIN netflix.customers c ON c.custid = r.custid
WHERE g.genrecode LIKE 'ACT') "Average Action Rating",
(SELECT ROUND(AVG(r.rating),2) FROM netflix.ratings100 r
JOIN netflix.movies_genres g ON r.movieid = g.movieid
JOIN netflix.customers c ON c.custid = r.custid
WHERE g.genrecode LIKE 'ADV') "Average Adventure Rating",
(SELECT ROUND(AVG(r.rating),2) FROM netflix.ratings100 r
JOIN netflix.movies_genres g ON r.movieid = g.movieid
JOIN netflix.customers c ON c.custid = r.custid
WHERE g.genrecode LIKE 'COM') "Average Comedy Rating",
(SELECT ROUND(AVG(r.rating),2) FROM netflix.ratings100 r
JOIN netflix.movies_genres g ON r.movieid = g.movieid
JOIN netflix.customers c ON c.custid = r.custid
WHERE g.genrecode LIKE 'MYS') "Average Mystery Rating"
FROM netflix.customers c JOIN netflix.ratings100 r
ON c.custid = r.custid
JOIN netflix.movies_genres g
ON r.movieid = g.movieid
WHERE c.numbkids BETWEEN 1 AND 3
ORDER BY c.numbkids
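The problem I keep running into is that the average ratings shown are identical for households with 1, 2, and 3 kids. I think it is just giving me the overall average and ignoring the fact that I'm trying to split it by number of kids. Any solutions?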
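Answer 0 (score: 1)
The subqueries are not restricted by the number of kids, and nothing correlates them with the main query, so each one returns the same overall average on every row.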
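To illustrate the point, here is a minimal sketch using Python's built-in sqlite3 module. The tiny tables and their rows are invented for the demo, and the netflix. schema prefix is dropped, so treat it as a toy under those assumptions rather than your actual data. It shows that a scalar subquery with no link to the outer row returns the same average on every row, while adding a condition on the outer numbkids makes it vary.

```python
import sqlite3

# Hypothetical miniature data: names mirror the question, rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (custid INTEGER, numbkids INTEGER);
    CREATE TABLE ratings100 (custid INTEGER, movieid INTEGER, rating INTEGER);
    CREATE TABLE movies_genres (movieid INTEGER, genrecode TEXT);
    INSERT INTO customers VALUES (1, 1), (2, 2);
    INSERT INTO movies_genres VALUES (10, 'ACT');
    INSERT INTO ratings100 VALUES (1, 10, 2), (2, 10, 4);
""")

# Uncorrelated: the inner alias c shadows the outer c, so the subquery
# never looks at the outer row and averages every ACT rating.
uncorrelated = conn.execute("""
    SELECT DISTINCT c.numbkids,
           (SELECT AVG(r.rating)
              FROM ratings100 r
              JOIN movies_genres g ON r.movieid = g.movieid
              JOIN customers c ON c.custid = r.custid
             WHERE g.genrecode = 'ACT')
      FROM customers c
     ORDER BY c.numbkids
""").fetchall()

# Correlated: c2.numbkids = c.numbkids ties the subquery to the outer row.
correlated = conn.execute("""
    SELECT DISTINCT c.numbkids,
           (SELECT AVG(r.rating)
              FROM ratings100 r
              JOIN movies_genres g ON r.movieid = g.movieid
              JOIN customers c2 ON c2.custid = r.custid
             WHERE g.genrecode = 'ACT'
               AND c2.numbkids = c.numbkids)
      FROM customers c
     ORDER BY c.numbkids
""").fetchall()

print(uncorrelated)  # [(1, 3.0), (2, 3.0)]
print(correlated)    # [(1, 2.0), (2, 4.0)]
```

Correlating every subquery would work, but it repeats the same joins once per genre, which is why a single grouped query is usually the simpler route.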
You don't need that many subselects. They make your query slow. See if this works!
SELECT c.numbkids "Number of Kids",
ROUND(AVG(CASE WHEN g.genrecode LIKE 'ACT' THEN r.rating END), 2) AS "Average Action Rating",
ROUND(AVG(CASE WHEN g.genrecode LIKE 'ADV' THEN r.rating END), 2) AS "Average Adventure Rating",
ROUND(AVG(CASE WHEN g.genrecode LIKE 'COM' THEN r.rating END), 2) AS "Average Comedy Rating",
ROUND(AVG(CASE WHEN g.genrecode LIKE 'MYS' THEN r.rating END), 2) AS "Average Mystery Rating"
FROM netflix.customers c JOIN netflix.ratings100 r
ON c.custid = r.custid
JOIN netflix.movies_genres g
ON r.movieid = g.movieid
WHERE c.numbkids BETWEEN 1 AND 3
GROUP BY c.numbkids
ORDER BY c.numbkids;
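One thing to watch with CASE inside AVG: AVG skips NULLs but counts zeros, so a branch that returns 0 for the other genres would drag each average down, whereas leaving out the ELSE (or writing ELSE NULL) averages only the matching rows. A minimal sketch with Python's sqlite3 module, using an invented single-table stand-in (rated) for the joined data:

```python
import sqlite3

# Hypothetical stand-in for the joined rows: two ACT ratings, two COM ratings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rated (genrecode TEXT, rating INTEGER);
    INSERT INTO rated VALUES ('ACT', 4), ('ACT', 2), ('COM', 5), ('COM', 5);
""")

with_zero, with_null = conn.execute("""
    SELECT AVG(CASE WHEN genrecode = 'ACT' THEN rating ELSE 0 END),
           AVG(CASE WHEN genrecode = 'ACT' THEN rating END)
      FROM rated
""").fetchone()

print(with_zero)  # 1.5 -> (4 + 2 + 0 + 0) / 4, the COM rows count as zeros
print(with_null)  # 3.0 -> (4 + 2) / 2, the COM rows are ignored
```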
Answer 1 (score: 0)
I think this is what you are trying to do (your query was missing the GROUP BY that the aggregation needs):
SELECT c.numbkids,
ROUND(AVG(CASE WHEN g.genrecode LIKE 'ACT' THEN r.rating END),2) as Average_Action_Rating,
ROUND(AVG(CASE WHEN g.genrecode LIKE 'ADV' THEN r.rating END),2) as Average_Adventure_Rating,
ROUND(AVG(CASE WHEN g.genrecode LIKE 'COM' THEN r.rating END),2) as Average_Comedy_Rating,
ROUND(AVG(CASE WHEN g.genrecode LIKE 'MYS' THEN r.rating END),2) as Average_Mystery_Rating
FROM netflix.ratings100 r
JOIN netflix.movies_genres g ON r.movieid = g.movieid
JOIN netflix.customers c ON c.custid = r.custid
WHERE c.numbkids BETWEEN 1 AND 3
GROUP BY c.numbkids
ORDER BY c.numbkids;
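This should give you a 3-row by 5-column result
- columns: numbkids plus the average rating for each of the 4 genres
- rows: one each for households with 1, 2, and 3 kids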
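If you want to check that shape without the real tables, here is a small sketch in Python's sqlite3 module. The rows are made up, the netflix. schema prefix is dropped, and = replaces LIKE since no wildcard is involved, so these are assumptions for the demo only.

```python
import sqlite3

# Hypothetical data: customers 1-3 have 1-3 kids, customer 4 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (custid INTEGER, numbkids INTEGER);
    CREATE TABLE ratings100 (custid INTEGER, movieid INTEGER, rating INTEGER);
    CREATE TABLE movies_genres (movieid INTEGER, genrecode TEXT);
    INSERT INTO customers VALUES (1, 1), (2, 2), (3, 3), (4, 0);
    INSERT INTO movies_genres VALUES (10, 'ACT'), (20, 'ADV'), (30, 'COM'), (40, 'MYS');
    INSERT INTO ratings100 VALUES
        (1, 10, 5), (1, 20, 4), (1, 30, 3), (1, 40, 2),
        (2, 10, 4), (2, 20, 3), (2, 30, 2), (2, 40, 1),
        (3, 10, 3), (3, 20, 2), (3, 30, 1), (3, 40, 5),
        (4, 10, 1);
""")

cursor = conn.execute("""
    SELECT c.numbkids AS numbkids,
           ROUND(AVG(CASE WHEN g.genrecode = 'ACT' THEN r.rating END), 2) AS Average_Action_Rating,
           ROUND(AVG(CASE WHEN g.genrecode = 'ADV' THEN r.rating END), 2) AS Average_Adventure_Rating,
           ROUND(AVG(CASE WHEN g.genrecode = 'COM' THEN r.rating END), 2) AS Average_Comedy_Rating,
           ROUND(AVG(CASE WHEN g.genrecode = 'MYS' THEN r.rating END), 2) AS Average_Mystery_Rating
      FROM ratings100 r
      JOIN movies_genres g ON r.movieid = g.movieid
      JOIN customers c ON c.custid = r.custid
     WHERE c.numbkids BETWEEN 1 AND 3
     GROUP BY c.numbkids
     ORDER BY c.numbkids
""")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(len(rows), len(columns))  # 3 5
for row in rows:
    print(row)
```

The customer with 0 kids is filtered out by the WHERE clause, so only three groups remain.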
Edit:
Alternatively, the query below gives you the data at the numbkids x genre level:
SELECT c.numbkids, g.genrecode, ROUND(AVG(r.rating),2) as Average_Rating
FROM netflix.ratings100 r
JOIN netflix.movies_genres g ON r.movieid = g.movieid AND g.genrecode IN ('ACT','ADV','COM','MYS')
JOIN netflix.customers c ON c.custid = r.custid
WHERE c.numbkids BETWEEN 1 AND 3
GROUP BY c.numbkids, g.genrecode
ORDER BY c.numbkids, g.genrecode;
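Answer 2 (score: 0)
Your subqueries don't take the number of kids into account, so they return the same value every time; the subqueries aren't necessary anyway. Something like the following should work (untested, since I don't have the necessary tables/data set up):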
SELECT c.numbkids "Number of Kids",
ROUND(AVG(case g.genrecode when 'ACT' then r.rating else null end),2) "Average Action Rating",
ROUND(AVG(case g.genrecode when 'ADV' then r.rating else null end),2) "Average Adventure Rating",
ROUND(AVG(case g.genrecode when 'COM' then r.rating else null end),2) "Average Comedy Rating",
ROUND(AVG(case g.genrecode when 'MYS' then r.rating else null end),2) "Average Mystery Rating"
FROM netflix.customers c
JOIN netflix.ratings100 r ON c.custid = r.custid
JOIN netflix.movies_genres g ON r.movieid = g.movieid
WHERE c.numbkids BETWEEN 1 AND 3
GROUP BY c.numbkids
ORDER BY c.numbkids;