Loading a file into HBase using Pig

Asked: 2012-03-13 05:36:07

Tags: hadoop hbase hdfs apache-pig

File contents:

one,1
two,2
three,3

File location: hdfs:/hbasetest.txt
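
For reference, a file like this can be put into HDFS with the standard hadoop fs commands (the local filename here is only an assumption):

# copy the local file into the HDFS root and confirm its contents
hadoop fs -put hbasetest.txt /hbasetest.txt
hadoop fs -cat /hbasetest.txt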

Table in HBase:

create 'mydata', 'mycf'
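
As a quick sanity check (not part of the original post), the table definition can be verified from a normal shell by piping a command into the HBase shell:

# list the column families of 'mydata' to confirm the table exists
echo "describe 'mydata'" | hbase shell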

Pig script:

A = LOAD '/hbasetest.txt' USING PigStorage(',') as (strdata:chararray, intdata:long);
STORE A INTO 'hbase://mydata'
        USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
              'mycf:intdata');
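
For context, a script like this is typically saved to a file and launched in MapReduce mode; the script filename below is just an assumption:

# run the script against the cluster (same as typing it into grunt)
pig -x mapreduce load_hbase.pig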

I get the following error:

On the console:

2012-03-13 16:26:22,170 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2012-03-13 16:26:22,170 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2012-03-13 16:26:22,204 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/zookeeper/KeeperException

In the log file:

Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. org/apache/zookeeper/KeeperException

java.lang.NoClassDefFoundError: org/apache/zookeeper/KeeperException
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:198)
    at org.apache.pig.backend.hadoop.hbase.HBaseStorage.getOutputFormat(HBaseStorage.java:389)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:87)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:76)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:52)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:292)
    at org.apache.pig.PigServer.compilePp(PigServer.java:1365)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1207)
    at org.apache.pig.PigServer.execute(PigServer.java:1201)
    at org.apache.pig.PigServer.access$100(PigServer.java:129)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:1528)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1575)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:534)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:871)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:388)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
    at org.apache.pig.Main.run(Main.java:455)
    at org.apache.pig.Main.main(Main.java:107)
Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.KeeperException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 25 more
============================================

If I use a DUMP command instead, it shows the file contents in grunt just fine. How can I fix this error?

2 Answers:

Answer 0 (score: 2)

Add hbase-*.jar and zookeeper-3.3.2.jar to PIG_HOME/lib. Also set PIG_CLASSPATH=/HADOOP_HOME/conf and HADOOP_CONF_DIR=/HADOOP_HOME/conf, and set HADOOP_HOME to your Hadoop installation directory.
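
A sketch of what that setup might look like in a shell, assuming the jar locations and the Hadoop install directory shown here (adjust names and paths to your installation):

# make the HBase and ZooKeeper classes visible to Pig
cp $HBASE_HOME/hbase-*.jar $PIG_HOME/lib/
cp $HBASE_HOME/lib/zookeeper-3.3.2.jar $PIG_HOME/lib/

# point Pig and Hadoop at the cluster configuration
export HADOOP_HOME=/usr/local/hadoop      # assumption: your Hadoop install dir
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
export PIG_CLASSPATH=$HADOOP_HOME/conf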

Answer 1 (score: 0)

Add the following line to your .bashrc:

export PIG_CLASSPATH=$PIG_HOME/pig-x.x.x-withouthadoop.jar:$HBASE_HOME/hbase-x.x.x.jar:$HBASE_HOME/lib/*:$HADOOP_HOME/lib/*:$PIG_CLASSPATH
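
With hypothetical version numbers filled in (use the jar names that actually exist in your installation), the line expands to something like the following; re-read .bashrc afterwards so the running shell picks it up:

# example only: Pig 0.9.2 and HBase 0.90.4 are assumed versions
export PIG_CLASSPATH=$PIG_HOME/pig-0.9.2-withouthadoop.jar:$HBASE_HOME/hbase-0.90.4.jar:$HBASE_HOME/lib/*:$HADOOP_HOME/lib/*:$PIG_CLASSPATH
source ~/.bashrc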