如何分类到年龄范围并找到前3名?

时间:2017-03-19 14:25:58

标签: hadoop hive hiveql

有一个带有年龄栏的蜂巢表,其中包含人的年龄。 必须计算并显示前3个年龄段。 例如:是否低于10,10-15,15-20,20-25,25-30 ...... 哪个年龄段更多。 请建议我这样做。

1 个答案:

答案 0 :(得分:2)

select      case 
                when age <= 10 then '0-10'
                else concat_ws
                     (
                         '-'
                        ,cast(floor(age/5)*5 as string)
                        ,cast((floor(age/5)+1)*5 as string)
                     )
            end                 as age_group
           ,count(*)            as cnt

from        mytable

group by    1

order by    cnt desc

limit       3
;

您可能需要设置此参数:

set hive.groupby.orderby.position.alias=true;

演示

with        mytable as 
            (
                select  floor(rand()*100) as age 
                from    (select 1) x lateral view explode(split(space(100),' ')) pe
            )

select      case 
                when age <= 10 then '0-10'
                else concat_ws('-',cast(floor(age/5)*5 as string),cast((floor(age/5)+1)*5 as string))
            end                             as age_group

           ,count(*)                        as cnt
           ,sort_array(collect_list(age))   as age_list

from        mytable

group by    1

order by    cnt desc
;
+-----------+-----+------------------------------+
| age_group | cnt |           age_list           |
+-----------+-----+------------------------------+
| 0-10      |   9 | [0,0,1,3,3,6,8,9,10]         |
| 25-30     |   9 | [26,26,28,28,28,28,29,29,29] |
| 55-60     |   8 | [55,55,56,57,57,57,58,58]    |
| 35-40     |   7 | [35,35,36,36,37,38,39]       |
| 80-85     |   7 | [80,80,81,82,82,82,84]       |
| 30-35     |   6 | [31,32,32,32,33,34]          |
| 70-75     |   6 | [70,70,71,71,72,73]          |
| 65-70     |   6 | [65,67,67,68,68,69]          |
| 50-55     |   6 | [51,53,53,53,53,54]          |
| 45-50     |   5 | [45,45,48,48,49]             |
| 85-90     |   5 | [85,86,87,87,89]             |
| 75-80     |   5 | [76,77,78,79,79]             |
| 20-25     |   5 | [20,20,21,22,22]             |
| 15-20     |   5 | [17,17,17,18,19]             |
| 10-15     |   4 | [11,12,12,14]                |
| 95-100    |   4 | [95,95,96,99]                |
| 40-45     |   3 | [41,44,44]                   |
| 90-95     |   1 | [93]                         |
+-----------+-----+------------------------------+