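The log shows that the cache file was created. But when I look at that location, the file does not exist, and when I try to read it from my mapper I get a FileNotFoundException.
Here is the code I am trying to run: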
JobConf conf2 = new JobConf(getConf(), CorpusCalculator.class);
conf2.setJobName("CorpusCalculator2");
//Distributed Caching of the file emitted by the reducer2 is done here
conf2.addResource(new Path("/opt/hadoop1/conf/core-site.xml"));
conf2.addResource(new Path("/opt/hadoop1/conf/hdfs-site.xml"));
//cacheFile(conf2, new Path(outputPathofReducer2));
conf2.setNumReduceTasks(1);
//conf2.setOutputKeyComparatorClass()
conf2.setMapOutputKeyClass(FloatWritable.class);
conf2.setMapOutputValueClass(Text.class);
conf2.setOutputKeyClass(Text.class);
conf2.setOutputValueClass(Text.class);
conf2.setMapperClass(MapClass2.class);
conf2.setReducerClass(Reduce2.class);
FileInputFormat.setInputPaths(conf2, new Path(inputPathForMapper1));
FileOutputFormat.setOutputPath(conf2, new Path(outputPathofReducer3));
DistributedCache.addCacheFile(new Path("/sunilFiles/M51.txt").toUri(),conf2);
JobClient.runJob(conf
Logs:
13/04/27 04:43:40 INFO filecache.TrackerDistributedCacheManager: Creating M51.txt in /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles-work-2204204368663038938 with rwxr-xr-x
13/04/27 04:43:40 INFO filecache.TrackerDistributedCacheManager: Cached /sunilFiles/M51.txt as /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles/M51.txt
13/04/27 04:43:40 INFO filecache.TrackerDistributedCacheManager: Cached /sunilFiles/M51.txt as /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles/M51.txt
13/04/27 04:43:40 INFO mapred.JobClient: Running job: job_local_0003
13/04/27 04:43:40 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@8c2df1
13/04/27 04:43:40 INFO mapred.MapTask: numReduceTasks: 1
13/04/27 04:43:40 INFO mapred.MapTask: io.sort.mb = 100
13/04/27 04:43:40 INFO mapred.MapTask: data buffer = 79691776/99614720
13/04/27 04:43:40 INFO mapred.MapTask: record buffer = 262144/327680
Inside configure():
Exception reading DistribtuedCache: java.io.FileNotFoundException: /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles/M51.txt (Is a directory)
Inside setup(): /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles/M51.txt
13/04/27 04:43:41 WARN mapred.LocalJobRunner: job_local_0003
Please help. I have been looking for a solution for 6 hours, and I have an assignment due tomorrow. Thank you very much.
Answer 0: (score: 0)
You may want to try the simpler -files option. To use it, the driver class needs to extend Configured and implement Tool.
For example,
hadoop jar jarname.jar driverclass -files file1.xml,file2.txt
In the mapper or reducer:
BufferedReader reader1 = new BufferedReader(new FileReader("file1.xml"));
BufferedReader reader2 = new BufferedReader(new FileReader("file2.txt"));
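The two lines above open the shipped files by their bare names, relative to the task's working directory. A minimal sketch of how such a side file could be read once and closed properly, using only the JDK: SideFileLoader and loadLines are hypothetical names, not part of Hadoop, and the UTF-8 encoding and the skipping of blank lines are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hadoop class): loads a small side file into memory.
final class SideFileLoader {
    private SideFileLoader() {}

    // Reads every non-blank line of fileName. A relative name such as
    // "file2.txt" is resolved against the current working directory.
    static List<String> loadLines(String fileName) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(fileName), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }
}
```

Calling SideFileLoader.loadLines("file2.txt") once from configure() and keeping the result in a field would avoid reopening the file for every record.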
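Answer 1: (score: 0)
I solved this problem by using the copyMerge() method, which merges all the files present on the various machines into a single file, and I was able to use that merged file successfully. It failed when I used the ordinary files. Thank you all for your replies.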
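The "(Is a directory)" part of the exception in the question suggests that the cached path M51.txt was a directory (possibly a reducer output directory holding several part files) rather than a single file, which would be consistent with merging having fixed it. As an alternative to merging, here is a rough sketch in plain Java that reads a local path whether it is a file or a directory. CachedPathReader is a hypothetical name, and skipping entries that start with "_" or "." (marker and checksum files) is an assumption about the directory contents.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not a Hadoop class): reads a localized cache path
// that may be either a single file or a directory of part files.
final class CachedPathReader {
    private CachedPathReader() {}

    static List<String> readAllLines(Path cached) throws IOException {
        if (Files.isRegularFile(cached)) {
            return Files.readAllLines(cached, StandardCharsets.UTF_8);
        }
        // Otherwise treat the path as a directory and collect its data files.
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(cached)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                // Assumption: names starting with "_" or "." are not data files.
                if (Files.isRegularFile(entry)
                        && !name.startsWith("_") && !name.startsWith(".")) {
                    parts.add(entry);
                }
            }
        }
        Collections.sort(parts); // deterministic order by file name
        List<String> lines = new ArrayList<>();
        for (Path part : parts) {
            lines.addAll(Files.readAllLines(part, StandardCharsets.UTF_8));
        }
        return lines;
    }
}
```

In the mapper, the local path printed by "Inside setup()" in the question's log could be passed to readAllLines via java.nio.file.Paths.get(...).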