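Datediff with CASE and GROUP BY

Time: 2019-01-23 21:41:48

Tags: hive

I'm trying to collapse a table into one row per ID, but I'm having trouble using GROUP BY together with a CASE statement that includes the DATEDIFF function: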

SELECT
o.id1
,o.id2
,count(case when o.type = 'TEST' and DATEDIFF(o.dte, m.dte) < 30 then id3 end) as win_30

FROM table1 m 
LEFT JOIN table2 o
ON (m.id = o.id2)
WHERE o.load_dt BETWEEN '20181001' AND '20181010'
GROUP BY 1,2;

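When I run this code I keep getting an 'Expression not in GROUP BY' error, and the problem seems to be the DATEDIFF (when I take out 'and DATEDIFF(o.dte, m.dte) < 30', it runs fine). Do I need to include the DATEDIFF in the GROUP BY?

Any help is appreciated. Thanks!

1 Answer:

Answer 0: (score: 0)

I did not get any error for a similar query.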

hive> select * from test_d1;
OK
1       2       10
3       4       20
5       6       30

hive> select * from test_d2;
OK
1       5
3       10

Query -

hive> select t1.id1, t1.id2, count(case when t2.id3=1 and nvl(t1.dte,t2.dte) < 10 then 1 else 0 end) as col3 from test_d1 t1 left outer join test_d2 t2 on t1.id1=t2.id3 group by 1,2;

Output -

OK
1       2       1
3       4       1
5       6       1

Try grouping by column names instead of column positions (grouping by position only works if you have set hive.groupby.orderby.position.alias=true).

hive> select t1.id1, t1.id2, count(case when t2.id3=1 and nvl(t1.dte,t2.dte) < 10 then 1 else 0 end) as col3 from test_d1 t1 left outer join test_d2 t2 on t1.id1=t2.id3 group by t1.id1, t1.id2;
OK
1       2       1
3       4       1
5       6       1
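
Applied to the query in the question, grouping by explicit column names might look like the sketch below. It assumes that the `0` alias in the question was a typo for `o` and that `id3` is a column of `table2`; the question confirms neither. Whether this removes the 'Expression not in GROUP BY' error may depend on the Hive version, so treat it as something to try rather than a confirmed fix.

```sql
-- Sketch only: table and column names come from the question.
-- The alias "o" and the "o.id3" qualification are assumptions.
SELECT
    o.id1,
    o.id2,
    COUNT(CASE WHEN o.type = 'TEST' AND DATEDIFF(o.dte, m.dte) < 30
               THEN o.id3 END) AS win_30
FROM table1 m
LEFT JOIN table2 o
    ON m.id = o.id2
WHERE o.load_dt BETWEEN '20181001' AND '20181010'
-- Column names instead of positions, so the query does not rely on
-- hive.groupby.orderby.position.alias being enabled.
GROUP BY o.id1, o.id2;
```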

Another observation - why do a left outer join when the columns in the select list all come from the right-hand table?
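
To illustrate this with the query from the question: the WHERE filter on `o.load_dt` already discards every row where `table2` has no match, because `load_dt` is NULL on those rows, so the left join there effectively behaves like an inner join. A minimal sketch of the inner-join form, under the same assumptions as the question's names (alias `o` for `table2`, `id3` belonging to `table2`):

```sql
-- Sketch only: an inner join returns the same rows as the question's
-- LEFT JOIN combined with a WHERE filter on the right-hand table.
SELECT
    o.id1,
    o.id2,
    COUNT(CASE WHEN o.type = 'TEST' AND DATEDIFF(o.dte, m.dte) < 30
               THEN o.id3 END) AS win_30
FROM table1 m
JOIN table2 o
    ON m.id = o.id2
WHERE o.load_dt BETWEEN '20181001' AND '20181010'
GROUP BY o.id1, o.id2;
```

If rows of `table1` without a match are actually wanted in the result, the `load_dt` condition would need to move into the ON clause, and the grouped columns would have to come from `m` rather than `o`.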