I have a pseudo-distributed Hadoop setup on a Linux machine. I ran some examples in Eclipse, which is also installed on that Linux machine, and they work fine. Now I want to execute MapReduce jobs from Eclipse (installed on a Windows machine) and access the HDFS that already exists on my Linux machine. I wrote the following driver code:
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class Windows_Driver extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int exitcode = ToolRunner.run(new Windows_Driver(), args);
        System.exit(exitcode);
    }

    @Override
    public int run(String[] arg0) throws Exception {
        JobConf conf = new JobConf(Windows_Driver.class);
        conf.set("fs.defaultFS", "hdfs://<Ip address>:50070");
        FileInputFormat.setInputPaths(conf, new Path("sample"));
        FileOutputFormat.setOutputPath(conf, new Path("sam"));
        conf.setMapperClass(Win_Mapper.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(Text.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        JobClient.runJob(conf);
        return 0;
    }
}
Mapper code:
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class Win_Mapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {

    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, Text> o, Reporter arg3) throws IOException {
        ...
        o.collect(... , ...);
    }
}
When I run it, I get the following error:
SEVERE: PriviledgedActionException as:miracle cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-miracle\mapred\staging\miracle1262421749\.staging to 0700
Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-miracle\mapred\staging\miracle1262421749\.staging to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1353)
at Windows_Driver.run(Windows_Driver.java:41)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at Windows_Driver.main(Windows_Driver.java:16)
How can I correct this error? And how can I access my HDFS remotely from Windows?
Answer 0 (score: 0)
The submit() method creates an internal JobSubmitter instance that performs all of the validation, including input-path and output-path availability, file/directory creation permissions, and so on. At different stages of the MR job it creates temporary directories under which it places temp files. The temporary directory is taken from core-site.xml via the property hadoop.tmp.dir (which defaults to /tmp/hadoop-${user.name}, matching the staging path in your stack trace). The problem on your system appears to be that the temp directory resolves under /tmp/, and the user running the MR job has no permission to change its rwx status to 700. Grant the appropriate permissions and rerun the job.
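A minimal sketch of that suggestion, assuming a Hadoop 1.x client where the staging area defaults to ${hadoop.tmp.dir}/mapred/staging: override hadoop.tmp.dir on the client side so the staging files land somewhere the submitting user can write. The path C:/hadoop/tmp and the helper name withWritableTmpDir are placeholders for illustration, not values from the question.

import org.apache.hadoop.mapred.JobConf;

public class StagingDirFix {
    // Redirect the client-side staging area away from \tmp by overriding
    // hadoop.tmp.dir, which the default staging root is expanded from
    // (${hadoop.tmp.dir}/mapred/staging in Hadoop 1.x mapred-default.xml).
    public static JobConf withWritableTmpDir(JobConf conf) {
        // Hypothetical directory that the submitting Windows user owns.
        conf.set("hadoop.tmp.dir", "C:/hadoop/tmp");
        return conf;
    }
}

In the driver above, calling withWritableTmpDir(conf) before JobClient.runJob(conf) would apply the override at job-submission time; alternatively, the same property can be set once in the client's core-site.xml.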