In my map function, I am trying to read a file from the DistributedCache and load its contents into a HashMap.
The MapReduce job's sysout logs print the contents of the HashMap, which shows that the file was found, loaded into the data structure, and the required operation performed: it iterates over the entries and prints them, so the load itself clearly works.
However, a few minutes into the MR job's run, I still get the following error:
    13/01/27 18:44:21 INFO mapred.JobClient: Task Id : attempt_201301271841_0001_m_000001_2, Status : FAILED
    java.io.FileNotFoundException: File does not exist: /app/hadoop/jobs/nw_single_pred_in/predict
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:67)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
This is the part that initializes the Path with the location of the file to be placed in the distributed cache:
    // inside main, surrounded by try catch block, yet no exception thrown here
    Configuration conf = new Configuration();
    // rest of the stuff that relates to conf
    Path knowledgefilepath = new Path(args[3]); // args[3] = /app/hadoop/jobs/nw_single_pred_in/predict/knowledge.txt
    DistributedCache.addCacheFile(knowledgefilepath.toUri(), conf);
    job.setJarByClass(NBprediction.class);
    // rest of job settings
    job.waitForCompletion(true); // kick off load
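For reference, here is a fuller, self-contained driver sketch along the same lines, assuming the Hadoop 1.x mapreduce API implied by the stack trace; the class names, job name, mapper class, and the input/output path arguments are illustrative assumptions, not part of my actual code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class NBpredictionDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = new Job(conf, "nb-prediction"); // job name is an assumption

            // register the knowledge file so the framework copies it to each task's local cache
            Path knowledgefilepath = new Path(args[3]);
            DistributedCache.addCacheFile(knowledgefilepath.toUri(), job.getConfiguration());

            job.setJarByClass(NBprediction.class);
            job.setMapperClass(PredictionMapper.class); // hypothetical mapper class
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            // assumed convention: args[0] = input dir, args[1] = output dir
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

One deliberate difference in the sketch: the cache file is registered through job.getConfiguration() rather than the standalone conf, since (as far as I know) the Job constructor takes a copy of the Configuration, so changes made to the original object after the Job is built are not seen by the tasks.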
And this is inside the map function:
    try {
        System.out.println("Inside try !!");
        Path files[] = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        Path cfile = new Path(files[0].toString()); // only one file
        System.out.println("File path : " + cfile.toString());
        CSVReader reader = new CSVReader(new FileReader(cfile.toString()), '\t');
        while ((nline = reader.readNext()) != null)
            data.put(nline[0], Double.parseDouble(nline[1])); // load into a hashmap
    } catch (Exception e) {
        // handle exception
    }
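For completeness, here is a self-contained mapper sketch that does the same cache-file load once in setup() instead of on every call to map(); the class name, field names, and the key/value types are illustrative assumptions, and it uses the same opencsv CSVReader as the snippet above:

    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    import au.com.bytecode.opencsv.CSVReader;

    public class PredictionMapper extends Mapper<LongWritable, Text, Text, Text> {

        // knowledge table, loaded once per task from the distributed cache
        private final Map<String, Double> data = new HashMap<String, Double>();

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            Path[] files = DistributedCache.getLocalCacheFiles(context.getConfiguration());
            if (files != null && files.length > 0) {
                // the cached file is tab-separated: key \t numeric value
                CSVReader reader = new CSVReader(new FileReader(files[0].toString()), '\t');
                try {
                    String[] nline;
                    while ((nline = reader.readNext()) != null) {
                        data.put(nline[0], Double.parseDouble(nline[1]));
                    }
                } finally {
                    reader.close();
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // ... use the preloaded map here and emit whatever the job needs ...
        }
    }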
Any help is appreciated.
Cheers!
Answer 0 (score: 0)
I reinstalled Hadoop and ran the job with the same jar, and the problem disappeared. It seems to have been a glitch rather than a programming error.