About the Pig job jar file

Date: 2012-11-24 03:09:56

Tags: hadoop apache-pig

I am using embedded Pig to implement a graph algorithm. It works fine in local mode, but on a fully distributed Hadoop cluster it always fails with the following error message (see the last few lines):

2012-11-23 22:00:00,651 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job4116346741117365374.jar
2012-11-23 22:00:09,418 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job4116346741117365374.jar created
2012-11-23 22:00:09,423 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up multi store job
2012-11-23 22:00:09,431 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=296
2012-11-23 22:00:09,431 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2012-11-23 22:00:09,442 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2012-11-23 22:00:09,949 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2012-11-23 22:00:09,949 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2012-11-23 22:00:09,992 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6015: During execution, encountered a Hadoop error.
2012-11-23 22:00:09,993 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2012-11-23 22:00:09,994 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt Features
0.20.1    0.10.0    jierus    2012-11-23 21:52:38    2012-11-23 22:00:09    HASH_JOIN,GROUP_BY,DISTINCT,FILTER,UNION

Some jobs have failed! Stop running all dependent jobs
Failed Jobs:
JobId    Alias    Feature    Message    Outputs
N/A    vec_comp,vec_comp_final,vec_comp_tmp HASH_JOIN,MULTI_QUERY    Message: java.io.FileNotFoundException: File /tmp/Job4116346741117365374.jar does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1184)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1160)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1132)

Does anyone know which part of my code or setup is wrong?

1 answer:

Answer 0 (score: 0)

It smells like you have not specified the job tracker for Pig (specifying only HDFS is not enough!), e.g.:

<property>
    <name>mapred.job.tracker</name>
    <value>10.xx.xx.99:9001</value>
</property>
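
This property normally lives in mapred-site.xml (or in the Hadoop conf directory that Pig picks up via PIG_CLASSPATH/HADOOP_CONF_DIR). Since the question uses embedded Pig, another option is to pass both the NameNode and the JobTracker programmatically when constructing the PigServer. The sketch below is a minimal example, not the asker's actual code; the host/port values, input path, and relation names are placeholders, and it assumes the Pig 0.10 PigServer API:

import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class EmbeddedPigClusterExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Both settings matter: with HDFS alone, the MapReduce side may fall back
        // to defaults and the generated Job*.jar ends up being looked for on the
        // wrong file system, producing the FileNotFoundException seen above.
        props.setProperty("fs.default.name", "hdfs://10.xx.xx.99:9000"); // placeholder NameNode
        props.setProperty("mapred.job.tracker", "10.xx.xx.99:9001");     // placeholder JobTracker

        // Run against the cluster rather than in local mode.
        PigServer pigServer = new PigServer(ExecType.MAPREDUCE, props);

        // Placeholder script: load an edge list and store it back out.
        pigServer.registerQuery("edges = LOAD 'input/edges' AS (src:chararray, dst:chararray);");
        pigServer.store("edges", "output/edges_copy");
    }
}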