I wrote the following query in Hive:
SELECT title, rating FROM
(
SELECT m.title as title, variance(r.rating) as var, r.rating as rating, r.time_stamp as time_stamp
FROM movies m JOIN ratings r ON m.movieid = r.movieid
DISTRIBUTE BY m.title, r.rating
GROUP BY m.title
SORT BY m.title, r.rating
) A
WHERE year(from_unixtime(time_stamp)) = '2015'
GROUP BY title
LIMIT 10;
But I get the following error:
Error while compiling statement: FAILED: ParseException line 6:4 missing ) at 'GROUP' near 'GROUP' line 6:10 missing EOF at 'BY' near 'GROUP'
Answer 0 (score: 0)
I think this is what you want:
SELECT m.movieid, m.title, variance(r.rating) as var
FROM movies m JOIN
ratings r
ON m.movieid = r.movieid
WHERE year(from_unixtime(time_stamp)) = 2015
GROUP BY m.movieid, m.title
ORDER BY var DESC
LIMIT 10;
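Answer 1 (score: 0)
- YEAR returns an integer, so compare it with 2015 rather than the string '2015'. (P.S. Is ratings not partitioned?)
- You should have a very good reason to use DISTRIBUTE BY and SORT BY, which are low-level technical clauses dating back to the early days of Hive.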
select m.title
      ,r.var
from (select r.movieid
            ,variance(r.rating) as var
      from ratings as r
      where year(from_unixtime(time_stamp)) = 2015
      group by r.movieid
      order by var desc
      limit 10
     ) as r
join movies as m
on m.movieid = r.movieid
;
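As for the ParseException itself: Hive's SELECT grammar expects GROUP BY before DISTRIBUTE BY and SORT BY, so the parser stops when it meets GROUP after DISTRIBUTE BY. If you do have a reason to keep those clauses, a minimal sketch of the accepted clause order could look like the query below. It assumes the same movies and ratings tables as the question and that time_stamp holds Unix epoch seconds. It only illustrates clause placement and is not a replacement for the queries above.

```sql
-- Sketch only: shows where DISTRIBUTE BY / SORT BY may appear in a Hive query.
-- Assumes ratings.time_stamp stores Unix epoch seconds (not stated in the question).
SELECT m.title AS title,
       variance(r.rating) AS var
FROM movies m
JOIN ratings r ON m.movieid = r.movieid
WHERE year(from_unixtime(r.time_stamp)) = 2015  -- filter rows before aggregating
GROUP BY m.title                                -- aggregation comes first
DISTRIBUTE BY title                             -- then routing of rows to reducers
SORT BY title;                                  -- then ordering within each reducer
```

Note that SORT BY only orders rows within each reducer. For a single globally ordered top-10 list, ORDER BY with LIMIT, as in the answers above, is the usual choice.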