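I'm trying to write an EMR job that runs Pig and writes to DSE, which we will then use for serving. Unfortunately, I can't get Pig to write to DSE, so I've reduced the problem to just connecting to a DSE node and trying to write to it. Here is what I'm doing.
On the Cassandra node: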
cqlsh> CREATE KEYSPACE cql3ks WITH replication =
{'class': 'SimpleStrategy', 'replication_factor': 1 };
cqlsh> USE cql3ks;
cqlsh:cql3ks> CREATE TABLE test (a int PRIMARY KEY, b int);
From my local machine:
export PIG_INITIAL_ADDRESS=<cassandra node IP>
export PIG_RPC_PORT=9160
export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
pig -x local
grunt> REGISTER /var/lib/cassandra/resources/cassandra/lib/libthrift-0.7.0.jar;
grunt> REGISTER /var/lib/cassandra/resources/cassandra/lib/cassandra-thrift-1.2.13.2.jar;
grunt> REGISTER /var/lib/cassandra/resources/cassandra/lib/cassandra-all-1.2.13.2.jar;
grunt> DEFINE CqlStorage org.apache.cassandra.hadoop.pig.CqlStorage();
grunt> moretestvalues= LOAD 'cql://cql3ks/test/' USING CqlStorage;
grunt> insertformat= FOREACH moretestvalues GENERATE TOTUPLE(TOTUPLE('a',a)),TOTUPLE(b);
grunt> STORE insertformat INTO 'cql://cql3ks/test?output_query=UPDATE+cql3ks.test+set+b+%3D+%3F' USING CqlStorage();
When I do this, I get the following error:
2014-02-25 18:50:27,952 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-02-25 18:50:28,506 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
2014-02-25 18:50:28,506 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2014-02-25 18:50:28,506 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.checkOutputSpecs(AbstractColumnFamilyOutputFormat.java:75)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:80)
at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:288)
at org.apache.pig.PigServer.compilePp(PigServer.java:1322)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1247)
at org.apache.pig.PigServer.execute(PigServer.java:1239)
at org.apache.pig.PigServer.access$400(PigServer.java:121)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1553)
at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:991)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:538)
at org.apache.pig.Main.main(Main.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
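Answer 0: (score: 1)
This is a version problem. You are probably using Hadoop 2.x, while the Cassandra libraries are built against the Hadoop 1.x API. org.apache.hadoop.mapreduce.JobContext is a class in Hadoop 1.x and an interface in Hadoop 2.x, which is exactly what the IncompatibleClassChangeError is reporting. If that is not the case, check that you are using the correct jars.
The next Cassandra bugfix release (2.0.6) should include compatibility with both APIs, or at least that is what this issue says.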
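To confirm which Hadoop API is actually on your classpath, a small reflection check can help. The following is only a sketch: the class name JobContextCheck and the helper describe are made up for illustration, and it assumes you run it with the same jars that Pig picks up.

```java
// Hypothetical diagnostic, not part of Pig, Hadoop, or Cassandra.
public class JobContextCheck {

    // Reports whether the named type resolves to an interface or a class
    // on the current classpath, or cannot be found at all.
    static String describe(String className) {
        try {
            Class<?> c = Class.forName(className);
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.JobContext";
        // "class" suggests the Hadoop 1.x API, "interface" suggests Hadoop 2.x.
        System.out.println(name + " -> " + describe(name));
    }
}
```

One way to run it, assuming the hadoop command is installed, is java -cp "$(hadoop classpath):." JobContextCheck. Keep in mind that pig -x local may load its own bundled Hadoop jars, so the result only reflects the classpath you pass in.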