Hadoop V2.7 and Eclipse

Time: 2015-07-04 23:29:43

Tags: eclipse hadoop

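I have set up Hadoop v2.7 on my Mac, and I can start the Hadoop daemons.

I want to write MapReduce programs in Eclipse, and I need some help getting Hadoop working there. I would like to know which jar files to add, along with a basic setup guide.

Below is my driver class, which I am unable to run: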

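import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.InverseMapper;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJobDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MyJobDriver <input path> <output path>");
            return -1;
        }
        Configuration conf = getConf();
        JobConf job = new JobConf(conf, MyJobDriver.class);

        Path in = new Path(args[0]);
        Path out = new Path(args[1]);
        FileInputFormat.setInputPaths(job, in);
        FileOutputFormat.setOutputPath(job, out);

        job.setJobName("Patent");

        job.setMapperClass(InverseMapper.class);
        // Each input line consists of two values separated by ","
        // K1 and V1 are both of type Text
        job.setInputFormat(KeyValueTextInputFormat.class);
        // Everything before the separator is the key; everything after it is the value
        job.set("key.value.separator.in.input.line", ",");

        // Key and value are written as strings, separated by a tab (the default)
        job.setOutputFormat(TextOutputFormat.class);
        // The map output key/value classes default to the job output classes, so
        // job.setMapOutputKeyClass() and job.setMapOutputValueClass() can be skipped
        // when the mapper emits the same types as the job's final output
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // JobClient submits the job and blocks until it finishes;
        // it throws an IOException if the job fails
        JobClient.runJob(job);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        MyJobDriver driver = new MyJobDriver();
        System.out.println("Calling the run method");
        int exitCode = ToolRunner.run(driver, args);
        System.exit(exitCode);
    }
}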

1 Answer:

Answer 0: (score: 0)

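Tracking down and collecting the necessary jar files by hand (there are many of them) is too much trouble. Instead, create a Maven project in Eclipse and add the necessary dependencies, as described at https://hadoopi.wordpress.com/2013/05/25/setup-maven-project-for-hadoop-in-5mn/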
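
Once the project is set up, it can help to confirm that the Hadoop classes the driver refers to are actually visible to Eclipse before debugging the job itself. The following is only a minimal sketch and not part of the original answer: `HadoopClasspathCheck` and `missingClasses` are hypothetical names, and the class list is simply taken from the driver in the question. It assumes the Hadoop libraries arrive through a Maven dependency such as `org.apache.hadoop:hadoop-client` in a 2.7.x version; the linked guide may use different artifacts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which Hadoop classes are missing from the classpath. */
final class HadoopClasspathCheck {

    // Classes referenced by MyJobDriver in the question (old "mapred" API).
    static final String[] REQUIRED = {
        "org.apache.hadoop.conf.Configuration",
        "org.apache.hadoop.fs.Path",
        "org.apache.hadoop.io.Text",
        "org.apache.hadoop.mapred.JobConf",
        "org.apache.hadoop.mapred.JobClient",
        "org.apache.hadoop.mapred.KeyValueTextInputFormat",
        "org.apache.hadoop.mapred.lib.InverseMapper",
        "org.apache.hadoop.util.ToolRunner"
    };

    /** Returns the class names that cannot be loaded, in the order given. */
    static List<String> missingClasses(String... classNames) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = HadoopClasspathCheck.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run static initializers
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingClasses(REQUIRED);
        if (missing.isEmpty()) {
            System.out.println("All required Hadoop classes were found.");
        } else {
            System.out.println("Missing from the classpath:");
            for (String name : missing) {
                System.out.println("  " + name);
            }
        }
    }
}
```

Running this class from the same Eclipse project as the driver (Run As > Java Application) uses the project's build path, so any class it reports as missing points to a dependency that still needs to be added.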