Count with desc in Hive MapReduce

Time: 2016-09-28 17:16:15

Tags: hadoop mapreduce hive

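I have a table in Hive that contains questionid, questiontag, answerID, and userIDofanswerer.

I need the top 10 most frequently used tags in this dataset.

I tried:

    select count(questionID), questiontag from table GROUP BY tags;

But how do I order by count(questionID)?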

2 answers:

Answer 0: (score: 1)

In the query below, ORDER BY cnt DESC LIMIT 10 selects the 10 most frequently used tags:

      SELECT count(questionID) cnt,
             questiontag
        FROM table
    GROUP BY questiontag
    ORDER BY cnt DESC
       LIMIT 10;

count(*) counts all rows, including rows where questionID is NULL.

count(questionID) counts only rows where questionID is not NULL.
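To make the difference between the two counts concrete, here is a minimal sketch. The `questions` name and the sample rows are made up for illustration and are not from the question; the inline rows are built with SELECT without FROM, which assumes a reasonably recent Hive version.

```sql
-- Hypothetical sample data: three rows tagged 'hive' (one with a NULL
-- questionID) and one row tagged 'hadoop'.
WITH questions AS (
  SELECT 1 AS questionID, 'hive' AS questiontag
  UNION ALL
  SELECT 2 AS questionID, 'hive' AS questiontag
  UNION ALL
  SELECT CAST(NULL AS INT) AS questionID, 'hive' AS questiontag
  UNION ALL
  SELECT 3 AS questionID, 'hadoop' AS questiontag
)
SELECT questiontag,
       count(*)          AS all_rows,      -- counts every row in the group
       count(questionID) AS non_null_ids   -- skips rows where questionID is NULL
  FROM questions
 GROUP BY questiontag
 ORDER BY all_rows DESC
 LIMIT 10;
```

For the 'hive' group, all_rows is 3 while non_null_ids is 2; for 'hadoop' both are 1. If questionID is never NULL in the real table, the two counts give the same ranking.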

Answer 1: (score: 0)

Try the following:

    select count(questionID) as cnt, questiontag from table GROUP BY questiontag
    order by cnt desc limit 10;
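One thing to watch for, since the table also carries answerID: if it holds one row per answer (an assumption, the question does not say), a question with several answers appears several times, and count(questionID) then counts answers per tag rather than questions. A sketch that counts each question once per tag, with `questions` as a hypothetical stand-in for the real table name:

```sql
-- `questions` is a placeholder table name, not taken from the question.
-- DISTINCT makes each questionID count once per tag, however many
-- answer rows it has.
SELECT questiontag,
       count(DISTINCT questionID) AS cnt
  FROM questions
 GROUP BY questiontag
 ORDER BY cnt DESC
 LIMIT 10;
```

If the table really has one row per question and tag, this returns the same result as the plain count(questionID) version.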