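I have set up Hadoop v2.7 on my Mac, and I can start the Hadoop daemons.
I want to write MapReduce programs in Eclipse and need some help getting Hadoop working there: which jar files do I need to add, and what basic setup steps should I follow?
Below is my driver class; I am unable to run it.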
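import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.InverseMapper;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJobDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: MyJobDriver <input path> <output path>");
            return 1;
        }
        Configuration conf = getConf();
        JobConf job = new JobConf(conf, MyJobDriver.class);
        Path in = new Path(args[0]);
        Path out = new Path(args[1]);
        FileInputFormat.setInputPaths(job, in);
        FileOutputFormat.setOutputPath(job, out);
        job.setJobName("Patent");
        job.setMapperClass(InverseMapper.class);
        // Each input line consists of two values separated by ","
        // K1 and V1 are both of type Text
        job.setInputFormat(KeyValueTextInputFormat.class);
        // Everything before the separator is the key; everything after it is the value
        job.set("key.value.separator.in.input.line", ",");
        // Key and value are written as strings, separated by a tab (the default)
        job.setOutputFormat(TextOutputFormat.class);
        // When K1 and K2 are of the same type and V1 and V2 are of the same type,
        // we can skip job.setMapOutputKeyClass() and job.setMapOutputValueClass()
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        // JobClient submits the job to the cluster and waits for it to complete
        JobClient.runJob(job);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        MyJobDriver driver = new MyJobDriver();
        System.out.println("Calling the run method");
        int exitCode = ToolRunner.run(driver, args);
        System.exit(exitCode);
    }
}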
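
To make the intended result concrete, here is a minimal plain-Java sketch (no Hadoop classes) of what this configuration should do to a single input line, as I understand it: KeyValueTextInputFormat splits the line at the first separator, InverseMapper swaps key and value, and TextOutputFormat joins them with a tab. The class name InverseSketch and the sample line are made up for illustration and are not part of Hadoop.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper class: a plain-Java imitation of the job's per-line behavior.
final class InverseSketch {

    // Imitates KeyValueTextInputFormat: split at the FIRST separator only.
    // If the separator is missing, the whole line becomes the key and the value is empty.
    static Map.Entry<String, String> split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    // Imitates InverseMapper (emit value as key, key as value) followed by
    // TextOutputFormat (key and value joined by a tab).
    static String process(String line, char separator) {
        Map.Entry<String, String> kv = split(line, separator);
        return kv.getValue() + "\t" + kv.getKey();
    }

    public static void main(String[] args) {
        // Sample line invented for this sketch
        System.out.println(process("3858241,956203", ','));
    }
}
```

Comparing the job's real output against a sketch like this on a few lines can help tell a setup problem in Eclipse apart from a logic problem in the driver.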
Answer 0 (score: 0)
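Tracking down and collecting the necessary jar files by hand is too much trouble, because there are many of them. Instead, create a Maven project in Eclipse and add the necessary dependencies, as described at https://hadoopi.wordpress.com/2013/05/25/setup-maven-project-for-hadoop-in-5mn/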
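
As a rough sketch of the dependency section in pom.xml, assuming the hadoop-client artifact is enough for a driver like the one in the question (it pulls in the common, HDFS, and MapReduce client jars transitively): the version 2.7.3 below is an assumption, so use whichever 2.7.x release matches your installation.

```xml
<dependencies>
  <!-- Assumed version: replace with the 2.7.x release installed on your machine -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.3</version>
  </dependency>
</dependencies>
```

With this in place, Eclipse's Maven integration should resolve the jars onto the project's build path, so none have to be added manually.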