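I'm trying to collapse a table into one row per ID, but I'm running into trouble using GROUP BY with a CASE expression that includes the DATEDIFF function: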
SELECT
o.id1
,o.id2
,count(case when o.type = 'TEST' and DATEDIFF(o.dte, m.dte) < 30 then id3 end) as win_30
FROM table1 m
LEFT JOIN table2 o
ON (m.id = o.id2)
WHERE o.load_dt BETWEEN '20181001' AND '20181010'
GROUP BY 1,2;
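When I run this code I keep getting an 'Expression not in GROUP BY' error, and the problem seems to be the DATEDIFF (when I take out 'and DATEDIFF(o.dte, m.dte) < 30', it runs just fine). Do I need to include the DATEDIFF in the GROUP BY?
I'd appreciate any help. Thanks!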
Answer 0 (score: 0)
I don't get any error with a similar query.
hive> select * from test_d1;
OK
1 2 10
3 4 20
5 6 30
hive> select * from test_d2;
OK
1 5
3 10
Query:
hive> select t1.id1, t1.id2, count(case when t2.id3=1 and nvl(t1.dte,t2.dte) < 10 then 1 else 0 end) as col3 from test_d1 t1 left outer join test_d2 t2 on t1.id1=t2.id3 group by 1,2;
Output:
OK
1 2 1
3 4 1
5 6 1
Try grouping by the column names instead of the column positions (grouping by position only works if you have run set hive.groupby.orderby.position.alias=true):
hive> select t1.id1, t1.id2, count(case when t2.id3=1 and nvl(t1.dte,t2.dte) < 10 then 1 else 0 end) as col3 from test_d1 t1 left outer join test_d2 t2 on t1.id1=t2.id3 group by t1.id1, t1.id2;
OK
1 2 1
3 4 1
5 6 1
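Applied to the query from the question, that suggestion might look like the sketch below. The table and column names come from the question; qualifying id3 with the o alias is an assumption about which table that column lives in, and whether this removes the error depends on your Hive version and on whether positional aliases are enabled.

```sql
-- Sketch only: names are taken from the question.
-- Assumption: id3 is a column of table2 (alias o).
SELECT
    o.id1,
    o.id2,
    COUNT(CASE
              WHEN o.type = 'TEST' AND DATEDIFF(o.dte, m.dte) < 30
              THEN o.id3
          END) AS win_30
FROM table1 m
LEFT JOIN table2 o
    ON m.id = o.id2
WHERE o.load_dt BETWEEN '20181001' AND '20181010'
-- Group by the column names instead of positions 1,2, so the result
-- does not depend on hive.groupby.orderby.position.alias.
GROUP BY o.id1, o.id2;
```

The DATEDIFF call sits inside the aggregate, so it does not need to appear in the GROUP BY itself; only the non-aggregated select columns do.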
Another observation: why use a left outer join when the columns in the select list come from the right-hand table?
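To make that observation concrete: the WHERE clause in the question filters on o.load_dt, which is NULL for any table1 row without a match in table2, so those rows are discarded and the left join already behaves like an inner join. A minimal sketch under that reading, with table2 driving the query (again assuming id3 belongs to table2):

```sql
-- Sketch only: same result set as the question's query is intended,
-- because the WHERE filter on o.load_dt removes unmatched table1 rows anyway.
SELECT
    o.id1,
    o.id2,
    COUNT(CASE
              WHEN o.type = 'TEST' AND DATEDIFF(o.dte, m.dte) < 30
              THEN o.id3
          END) AS win_30
FROM table2 o
JOIN table1 m
    ON m.id = o.id2
WHERE o.load_dt BETWEEN '20181001' AND '20181010'
GROUP BY o.id1, o.id2;
```

If table2 rows without a matching table1 row should still appear (with a count of 0), replacing JOIN with LEFT JOIN in this form would keep them.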