Apache Pig UDF cannot be resolved

Date: 2013-10-01 02:20:42

Tags: java hadoop apache-pig

I have the following Java code (it returns a fixed value for testing):

Static.java

package com.company;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;

public class Static extends EvalFunc<String>
{
        public String exec(Tuple input) throws IOException
        {
                if (input == null || input.size() == 0)
                    return null;
                try
                {
                        String str = (String)input.get(0);
                        return "5876L";
                }
                catch (Exception e)
                {
                        throw WrappedIOException.wrap("Caught exception processing input row ", e);
                }
        }
}

Built with:

javac -Xbootclasspath:/usr/lib/jvm/j2sdk1.6-oracle/jre/lib/rt.jar -source 1.6 -cp `hadoop classpath`:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/pig/pig-0.11.0-cdh4.3.0.jar Static.java

jar -cf Static.jar Static.class

Apache Pig job:

REGISTER /path/to/Static.jar;

loaded = LOAD 'data/' USING com.twitter.elephantbird.pig.store.LzoPigStorage() AS (line:chararray);

loaded = SAMPLE loaded 0.00001;

sized = FOREACH loaded GENERATE com.company.Static(line);

DUMP sized;

Run with:

# pig -f static.pig -x local

This fails with the following error:

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve com.company.Static using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

Changing the line

sized = FOREACH loaded GENERATE com.company.Static(line);

to

sized = FOREACH loaded GENERATE Static(request);

gives this error instead:

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Static (wrong name: com/company/Static)

1 Answer:

Answer 0 (score: 3)

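Your jar appears to be missing the required directory structure. Because the class is declared in package com.company, the Static.class file must sit under the com/company directory inside the jar. Your build adds Static.class at the root of the jar, so com.company.Static cannot be found, and loading it as plain Static fails with "wrong name" because the class file itself declares the name com/company/Static. For details, see this other SO question.

For more details, see my answer to this other question you posted.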
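
To make the layout requirement above concrete, here is a minimal sketch that checks whether a jar holds a class at the path implied by its package. The class name JarLayoutCheck and its helper methods are made up for this illustration and are not part of Pig or of the original answer. It also assumes Java 7 or later for try-with-resources, whereas the question compiles with -source 1.6.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper (not part of Pig): checks the jar layout for a class.
public class JarLayoutCheck {

    // Jar entry path that corresponds to a fully qualified class name,
    // e.g. com.company.Static -> com/company/Static.class
    public static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True if the jar contains the class at the path implied by its package.
    public static boolean containsClass(File jar, String className) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            return jarFile.getEntry(entryNameFor(className)) != null;
        }
    }

    // Demo helper: writes a temporary jar with a single empty entry.
    public static File jarWithEntry(String entryName) throws IOException {
        File jar = File.createTempFile("layout", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry(entryName));
            out.closeEntry();
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        // Mirrors the jar from the question: class file at the jar root.
        File flat = jarWithEntry("Static.class");
        // Layout that matches the package declaration.
        File nested = jarWithEntry("com/company/Static.class");

        System.out.println("flat jar:   " + containsClass(flat, "com.company.Static"));   // false
        System.out.println("nested jar: " + containsClass(nested, "com.company.Static")); // true
    }
}
```

One way to produce the nested layout with the commands from the question would be to pass -d . to javac, so that it writes com/company/Static.class, and then add that path (rather than a bare Static.class) to the jar. Running jar -tf Static.jar lists the entries, so you can confirm the result before registering the jar in Pig.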