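I am trying to run a MapReduce job using Pig and Cassandra, and I always get the error: ERROR 2118: Unable to create input splits for: cassandra://constellation/logs
[Solved] I was missing some environment variables:
PIG_RPC_PORT, PIG_INITIAL_ADDRESS, PIG_PARTITIONER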
/opt/cassandra-0.7.0-beta3/contrib/pig$ bin/pig_cassandra example-script.pig
10/11/15 17:38:26 INFO pig.Main: Logging error messages to: /opt/cassandra-0.7.0-beta3/contrib/pig/pig_1289839106859.log
2010-11-15 17:38:27,809 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://hadoop-master-1.dkd.lan:8020
2010-11-15 17:38:29,756 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: hadoop-master-1.dkd.lan:8021
2010-11-15 17:38:32,753 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(hdfs://hadoop-master-1.dkd.lan/tmp/temp657556636/tmp-375431593:org.apache.pig.builtin.BinStorage) - 1-82 Operator Key: 1-82)
2010-11-15 17:38:32,960 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
2010-11-15 17:38:33,100 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2010-11-15 17:38:33,100 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
2010-11-15 17:38:33,364 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2010-11-15 17:38:38,771 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2010-11-15 17:38:38,999 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2010-11-15 17:38:39,055 [Thread-4] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2010-11-15 17:38:39,500 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2010-11-15 17:38:40,340 [Thread-4] INFO org.apache.hadoop.mapred.JobClient - Cleaning up the staging area hdfs://hadoop-master-1.dkd.lan/var/lib/hadoop-0.20/cache/mapred/mapred/staging/dkd-sprenger/.staging/job_201011101636_0011
2010-11-15 17:38:40,356 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2010-11-15 17:38:40,357 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
2010-11-15 17:38:40,402 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2010-11-15 17:38:40,517 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: cassandra://constellation/logs
Details at logfile: /opt/cassandra-0.7.0-beta3/contrib/pig/pig_1289839106859.log
Does anyone have an idea? -> Solved: I had not set some required environment variables.
Environment: Ubuntu Server 10.04
Versions: Hadoop: 0.20, Pig: 0.7, Cassandra: 0.7.0 beta3
Answer 0: (score: 0)
The asker has updated the question to include the answer:
[Solved] I was missing some environment variables:
PIG_RPC_PORT, PIG_INITIAL_ADDRESS, PIG_PARTITIONER
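
For anyone hitting the same ERROR 2118, here is a minimal sketch of setting the three variables before launching the script. The values are assumptions, not taken from the question: 9160 is Cassandra's default Thrift RPC port, localhost stands in for any reachable Cassandra node, and RandomPartitioner has to match whatever partitioner the cluster is actually configured with. The loop only checks that nothing is empty before the job is started.

```shell
# Assumed values for a single local Cassandra node; adjust for your cluster.
export PIG_RPC_PORT=9160              # assumption: default Thrift RPC port
export PIG_INITIAL_ADDRESS=localhost  # assumption: any reachable Cassandra node
export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner  # assumption: must match the cluster

# Collect the names of any variables that are still empty.
missing=""
for name in PIG_RPC_PORT PIG_INITIAL_ADDRESS PIG_PARTITIONER; do
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    missing="$missing $name"
  fi
done

if [ -n "$missing" ]; then
  echo "missing environment variables:$missing" >&2
else
  echo "all three variables are set"
  # Then, from contrib/pig, run the job as in the question:
  # bin/pig_cassandra example-script.pig
fi
```

If the variables are exported in the same shell session (or in a profile file) before calling bin/pig_cassandra, the launcher script can pick them up; setting them in a different shell has no effect on the job.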