I have a file in Hadoop: /home/hduser/IH/input/imageslocalpaths.txt (I have checked that it is there with hadoop fs -ls IH/input/imageslocalpaths.txt). When I run:
hadoop jar IH.jar IH/input/imageslocalpaths.txt
I get:
Input path does not exist: hdfs://localhost:54310/user/hduser/IH%2Finput%2Fimageslocalpaths.txt
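Can anyone tell me how to stop Hadoop from changing the slashes to %2F, or suggest another workaround?
(I have tried the full path, but Hadoop just appends it to /user/hduser, giving /user/hduser/user/hduser..., still with %2F.)
Here is my main method (do you want the other bits?):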
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Configuration conf2 = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:54310");
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();

    Job job1 = new Job(conf, "MergeImages");
    job1.setJarByClass(ImageHandlerMain.class);
    job1.setMapperClass(BinaryFilesToHadoopSequenceFileMapper.class);
    job1.setOutputKeyClass(Text.class);
    job1.setOutputValueClass(BytesWritable.class);
    FileInputFormat.addInputPath(job1, new Path(URLEncoder.encode(otherArgs[0],"UTF-8")));
    job1.setInputFormatClass(TextInputFormat.class);
    FileOutputFormat.setOutputPath(job1, new Path(URLEncoder.encode(otherArgs[1],"UTF-8"))); //put result into intermediate folder
    job1.setInputFormatClass(TextInputFormat.class);
    job1.setOutputFormatClass(SequenceFileOutputFormat.class);
    ControlledJob cJob1 = new ControlledJob(conf);
    cJob1.setJob(job1);

    Job job2 = new Job(conf2,"FindDuplicates");
    job2.setJarByClass(ImageHandlerMain.class);
    job2.setMapperClass(ImagePHashMapper.class);
    job2.setReducerClass(ImageDupsReducer.class);
    job2.setOutputKeyClass(Text.class);
    job2.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job2, new Path(URLEncoder.encode(otherArgs[1],"UTF-8") + "/part-r-00000")); //get the part-r-00000 file from the intermediate folder
    FileOutputFormat.setOutputPath(job2, new Path(otherArgs[2])); //put result into output folder
    job2.setInputFormatClass(SequenceFileInputFormat.class);
    ControlledJob cJob2 = new ControlledJob(conf2);
    cJob2.setJob(job2);

    JobControl jobctrl = new JobControl("jobctrl");
    jobctrl.addJob(cJob1);
    jobctrl.addJob(cJob2);
    cJob2.addDependingJob(cJob1);
    jobctrl.run();
}
Answer 0 (score: 1)
The problem is in this line of code:
FileInputFormat.addInputPath(job2, new Path(URLEncoder.encode(otherArgs[1],"UTF-8") + "/part-r-00000")); //get the part-r-00000 file from the intermediate folder
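Here, wrapping the path in URLEncoder.encode converts each "/" to %2F. The same call wraps otherArgs[0] and otherArgs[1] in the earlier addInputPath and setOutputPath calls, and the path in your error message comes from otherArgs[0].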
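To see the effect in isolation, here is a minimal sketch that uses only java.net.URLEncoder, with no Hadoop classes. The class name SlashEncodingDemo and the helper encode are made up for illustration; the path strings are taken from the question. It also shows that an encoded absolute path no longer starts with "/", which would be consistent with Hadoop treating it as a relative name under /user/hduser, as the question describes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Standalone demonstration; nothing here depends on Hadoop.
class SlashEncodingDemo {

    // Hypothetical helper: the same call the question wraps around each path argument.
    static String encode(String rawPath) {
        try {
            return URLEncoder.encode(rawPath, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform is required to support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Relative path from the question's command line: each "/" becomes "%2F".
        System.out.println(encode("IH/input/imageslocalpaths.txt"));

        // Absolute path: the leading "/" is encoded too.
        String encodedAbsolute = encode("/user/hduser/IH/input/imageslocalpaths.txt");
        System.out.println(encodedAbsolute);

        // Checks whether the encoded string still looks like an absolute path.
        System.out.println(encodedAbsolute.startsWith("/"));
    }
}
```

The first printed value matches the file name shown in the "Input path does not exist" message.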
Possible solution:
FileInputFormat.addInputPath(job2, new Path(URLEncoder.encode(otherArgs[1],"UTF-8").replace("%2F", "/") + "/part-r-00000")); //get the part-r-00000 file from the intermediate folder
After encoding, simply use the replace method to turn "%2F" back into "/".
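As a rough check of this workaround, the sketch below applies the same encode-then-replace step to plain strings. The class and method names are hypothetical, and the second path (with a space in it) is an invented example, not something from the question. It suggests the replacement restores the slashes but leaves every other encoded character as it is.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Standalone demonstration; nothing here depends on Hadoop.
class EncodeThenRestoreDemo {

    // Hypothetical helper: encode, then turn "%2F" back into "/" as the answer suggests.
    static String encodeThenRestoreSlashes(String rawPath) {
        try {
            return URLEncoder.encode(rawPath, "UTF-8").replace("%2F", "/");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform is required to support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Path from the question: the round trip gives back the original string.
        String simple = "IH/input/imageslocalpaths.txt";
        System.out.println(encodeThenRestoreSlashes(simple).equals(simple));

        // Invented path with a space: the space stays encoded as "+".
        String withSpace = "IH/my input/images.txt";
        System.out.println(encodeThenRestoreSlashes(withSpace));
    }
}
```

For paths like the one in the question, the round trip returns the original string, so passing otherArgs[0] to new Path(...) without URLEncoder.encode should behave the same way and is simpler.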
Answer 1 (score: 0)
I am not sure where the problem comes from, but try checking the following:
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
FileInputFormat.setInputPaths(job, new Path(inputLocation)); // where inputLocation is just a String