Impala: selecting a field with a condition when using group by

Time: 2016-08-30 21:00:26

Tags: sql group-by impala

I have the following table:

id   |  animal   |  timestamp   | team
---------------------------------------
 1   |  dog      | 2016-08-01   | blue
 2   |  cat      | 2016-08-02   | blue
 3   |  bird     | 2016-07-05   | red
 4   |  cow      | 2016-08-04   | red
 5   |  snake    | 2016-08-12   | yellow

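I want to find the animal for each team, with this rule: if a team has more than one animal, pick the one with the latest timestamp. Is this possible? Thanks!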

2 answers:

Answer 0: (score: 1)

The typical approach uses row_number():

select t.*
from (select t.*,
             row_number() over (partition by team order by `timestamp` desc) as seqnum
      from t
     ) t
where seqnum = 1;
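
If two animals in the same team can share a timestamp, the query above returns an arbitrary one of them. Here is a minimal sketch of a deterministic variant, assuming `id` is unique and that the higher `id` should win ties (both are assumptions; the question does not say how ties should be resolved):

```sql
-- Same row_number() pattern, with id as an assumed tie-breaker.
-- `timestamp` is backtick-quoted because it is also a type name in Impala.
select t.*
from (select t.*,
             row_number() over (partition by team
                                order by `timestamp` desc, id desc) as seqnum
      from t
     ) t
where seqnum = 1;
```

As in the original query, `t` stands for the actual table name, and the result carries the extra `seqnum` column unless the outer select lists the columns explicitly.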

Answer 1: (score: 0)

You can use the following query, which keeps the row whose timestamp equals the latest (max) timestamp within its team:

select * from teams t1 where `timestamp`=(select max(t2.`timestamp`) from teams t2 where t2.team = t1.team);

In practice:

[localhost:21000] > create table teams(id int, animal string, `timestamp` timestamp, team string);
[localhost:21000] > insert into teams values (1, "dog", "2016-08-01", "blue"), (2, "cat", "2016-08-02", "blue"), (3, "bird", "2016-07-05", "red"), (4, "cow", "2016-08-04", "red"), (5, "snake", "2016-08-12", "yellow");
[localhost:21000] > select * from teams t1 where `timestamp`=(select max(t2.`timestamp`) from teams t2 where t2.team = t1.team);
+----+--------+---------------------+--------+
| id | animal | timestamp           | team   |
+----+--------+---------------------+--------+
| 2  | cat    | 2016-08-02 00:00:00 | blue   |
| 4  | cow    | 2016-08-04 00:00:00 | red    |
| 5  | snake  | 2016-08-12 00:00:00 | yellow |
+----+--------+---------------------+--------+
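
Since the question mentions group by, the same selection can also be sketched as an aggregate joined back to the table. This assumes the `teams` table created above; the alias `latest` and the column name `max_ts` are made up for illustration:

```sql
-- Compute each team's latest timestamp, then join back to fetch the full row.
select t.*
from teams t
join (select team, max(`timestamp`) as max_ts
      from teams
      group by team) latest
  on t.team = latest.team
 and t.`timestamp` = latest.max_ts;
```

Like the correlated subquery, this returns more than one row for a team if several of its animals share the latest timestamp.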