Unable to query a Hive view that depends on a UDTF

Asked: 2017-06-08 18:26:23

Tags: function view hive

I created a table as follows:

CREATE TABLE TEST (ID INT, SCORE INT, NAME STRING);

and inserted a few records. I want to run a top-k query that returns the top record for each ID, ranked by SCORE.

I am using the each_top_k() UDTF from the Hivemall library, as described here: https://hivemall.incubator.apache.org/userguide/misc/topk.html

SELECT EACH_TOP_K(1, ID, SCORE, ID, NAME) AS (RANK, SCORE, ID, NAME) FROM (
SELECT * FROM TEST
CLUSTER BY ID
) T;

This successfully returns the top SCORE for each ID. However, I then created a view as follows:

CREATE VIEW TEST_VIEW AS SELECT EACH_TOP_K(1, ID, SCORE, ID, NAME) AS (RANK, SCORE, ID, NAME) FROM (
SELECT * FROM TEST
CLUSTER BY ID
) T;

and the statement executes successfully. However, a simple

SELECT * FROM TEST_VIEW;

returns the following error:

  

Error: Error while compiling statement: FAILED: SemanticException View test_view is corresponding to UDTF, rather than a SelectOperator. (state=42000,code=40000)

I can't find any reference to this error. Any suggestions?

1 answer:

Answer 0 (score: 1)

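I think Hive has trouble inferring the data type of each field returned by your UDTF at runtime. Wrapping your query in an outer query that casts each column explicitly should fix it, e.g.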

CREATE VIEW TEST_VIEW AS 
select cast(rank as bigint) as rank, cast(score as double) as score, cast(id as string) as id, cast(name as string) as name from (
SELECT EACH_TOP_K(1, ID, SCORE, ID, NAME) AS (RANK, SCORE, ID, NAME) FROM (
SELECT * FROM TEST
CLUSTER BY ID
) T ) t2;
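
If wrapping the UDTF still gives trouble, another option is to avoid the UDTF inside the view altogether. The following is only a sketch, not tested against the setup in the question: it uses Hive's built-in ROW_NUMBER() window function to pick the top row per ID, so the view body is an ordinary SELECT. The view name TEST_VIEW_RN and the column alias RNK are made up for this example, and on large tables this approach may well be slower than each_top_k().

```sql
-- Sketch: top-1 row per ID by SCORE using a built-in window function
-- instead of the each_top_k() UDTF.
-- TEST_VIEW_RN and RNK are hypothetical names chosen for illustration.
CREATE VIEW TEST_VIEW_RN AS
SELECT RNK, SCORE, ID, NAME
FROM (
  SELECT
    -- Number the rows within each ID, highest SCORE first.
    ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SCORE DESC) AS RNK,
    SCORE,
    ID,
    NAME
  FROM TEST
) T
WHERE RNK <= 1;  -- raise the bound to keep more than one row per ID

SELECT * FROM TEST_VIEW_RN;
```

Note that when several rows of the same ID share the top SCORE, ROW_NUMBER() keeps just one of them, and which one is not specified unless more columns are added to the ORDER BY.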