How to write a UDAF in Hive that takes multiple arguments

Asked: 2014-09-25 11:22:08

Tags: java hadoop hive

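I want to write a UDAF in Hive that takes multiple arguments. Below is the skeleton of my classes; I have removed the actual logic from the methods, since it only takes up space.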

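package hive.udaf; // matches the class name used in CREATE TEMPORARY FUNCTION

// The original omits its imports; the ones below are assumed.
import java.util.Map;
import java.util.Set;

import org.apache.hadoop.hive.ql.exec.UDAF;
import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;
// Assumed: the original does not show which DoubleWritable was imported.
// The "struct" in the error message is possibly a sign that it was this
// Hadoop class; the class Hive maps to its double type is
// org.apache.hadoop.hive.serde2.io.DoubleWritable.
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class CountWithRating extends UDAF {

    public static class MovieCountWithGivenRating implements UDAFEvaluator {
        // Partial aggregation state handed from one evaluator to another.
        public static class PartialClass {
            Set users;
            Map movieMap; // movieId -> number of users who gave a rating equal to or higher than the given rating
        }

        private PartialClass partial;

        public void init() {
            partial = null;
        }

        public boolean iterate(IntWritable userId, IntWritable movieId, Text rating, DoubleWritable static_rating) {
            // actual logic removed
            return true;
        }

        public PartialClass terminatePartial() {
            return partial;
        }

        public boolean merge(PartialClass other) {
            // actual logic removed
            return true;
        }

        public IntWritable terminate() {
            // actual logic removed
            return null;
        }
    }
}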
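Since the method bodies are elided above, here is a minimal plain-Java sketch of what the init / iterate / terminatePartial / merge / terminate lifecycle could compute. It is only an illustration under stated assumptions, not the original logic: the class name RatingCountSketch is hypothetical, plain Integer, String and Double stand in for IntWritable, Text and DoubleWritable so that it runs without Hive on the classpath, and the result is assumed to be the largest per-movie count of distinct users whose rating is at or above the threshold (suggested by the name MaxMovieCount and the comment on movieMap).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the evaluator; no Hive or Hadoop types are used.
class RatingCountSketch {

    // Partial state: for each movieId, the distinct users who rated it
    // at or above the threshold. (Assumed layout, not the original's.)
    static class Partial {
        final Map<Integer, Set<Integer>> usersByMovie = new HashMap<>();
    }

    private Partial partial;

    // Reset the state before a new aggregation starts.
    void init() {
        partial = null;
    }

    // Consume one input row; rows with nulls or unparsable ratings are skipped.
    boolean iterate(Integer userId, Integer movieId, String rating, Double threshold) {
        if (userId == null || movieId == null || rating == null || threshold == null) {
            return true;
        }
        double value;
        try {
            value = Double.parseDouble(rating.trim());
        } catch (NumberFormatException e) {
            return true;
        }
        if (value >= threshold) {
            if (partial == null) {
                partial = new Partial();
            }
            partial.usersByMovie.computeIfAbsent(movieId, k -> new HashSet<>()).add(userId);
        }
        return true;
    }

    // Hand the partial state to another evaluator.
    Partial terminatePartial() {
        return partial;
    }

    // Fold another evaluator's partial state into this one.
    boolean merge(Partial other) {
        if (other == null) {
            return true;
        }
        if (partial == null) {
            partial = new Partial();
        }
        for (Map.Entry<Integer, Set<Integer>> e : other.usersByMovie.entrySet()) {
            partial.usersByMovie.computeIfAbsent(e.getKey(), k -> new HashSet<>()).addAll(e.getValue());
        }
        return true;
    }

    // Final result: the largest number of qualifying users for any movie,
    // or null when no row qualified.
    Integer terminate() {
        if (partial == null) {
            return null;
        }
        int max = 0;
        for (Set<Integer> users : partial.usersByMovie.values()) {
            max = Math.max(max, users.size());
        }
        return max;
    }
}
```

Keeping a set of users per movie, rather than a bare counter, is a design choice of this sketch: it lets merge combine two partial states without counting the same user twice for one movie.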

I then registered the function in Hive as follows:

CREATE TEMPORARY FUNCTION MaxMovieCount AS 'hive.udaf.CountWithRating';

and called it like this:

select MaxMovieCount(userId,movieId,rating,4.0) from rating;

But it fails with the following error:

FAILED: NoMatchingMethodException No matching method for class hive.udaf.CountWithRating with (int, int, string, double). Possible choices: _FUNC_(int, int, string, struct)

0 answers:

No answers yet