Hive: finding the maximum value within a group

Asked: 2012-03-29 10:29:57

Tags: sql hive max

I have a Hive table like this:

create external table test(
  test_id string,
  test_name string,
  description string,
  clicks int,
  last_referred_click_date string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE LOCATION  '{some_location}';

For each test_id, I need to find the total number of clicks and the last click date (the latest date within that test_id group).

I am doing something like this:

insert overwrite table test partition(weekending='{input_date}')
  select s.test_id,s.test_name,s.description,max(click_date),
    sum(t.click) as clicks
   group by s.test_id,s.test_name,s.description order by clicks desc; 

Does the max() function work on strings? My click_date is in 'yyyy-mm-dd' format and is stored as a string. If it does not, what can I do here? Write a UDF?

1 Answer:

Answer 0: (score: 2)

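SELECT s.test_id,
       s.test_name,
       s.description,
       -- Hive has no DateTime type; zero-padded 'yyyy-MM-dd' strings
       -- already sort in date order, so MAX() needs no cast
       MAX(s.last_referred_click_date) AS last_click_date,
       SUM(s.clicks) AS Total_Clicks
FROM test s
WHERE s.test_id = '1'  -- test_id is a string column
GROUP BY s.test_id, s.test_name, s.description
ORDER BY Total_Clicks DESC;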
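
On the "what if max() does not work on my strings" part of the question: MAX() on a string column compares values lexicographically, which matches chronological order only when the format is zero-padded and ordered year, month, day. If the dates were stored in some other layout, a custom UDF should not be necessary; one option is to convert with the built-in unix_timestamp() and from_unixtime() functions. A minimal sketch, assuming the test table from the question and a hypothetical 'MM/dd/yyyy' storage format (the pattern and the output aliases are illustrative, not from the original post):

```sql
-- Assumption: last_referred_click_date holds values like '03/29/2012',
-- which do NOT sort chronologically as plain strings.
SELECT s.test_id,
       SUM(s.clicks) AS total_clicks,
       -- Parse each string to epoch seconds, take the maximum,
       -- then format the result back as a 'yyyy-MM-dd' string.
       from_unixtime(
         MAX(unix_timestamp(s.last_referred_click_date, 'MM/dd/yyyy')),
         'yyyy-MM-dd'
       ) AS last_click_date
FROM test s
GROUP BY s.test_id
ORDER BY total_clicks DESC;
```

The pattern strings follow Java's SimpleDateFormat conventions, so month is an uppercase 'MM'; a lowercase 'mm' would be read as minutes.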