MapReduce with multiple mappers and reducers

Time: 2018-02-23 01:51:08

Tags: hadoop2

I am trying to implement a job with multiple mappers and reducers. Here is my main method:

public static void main(String[] args) {
    //create a new configuration
    Configuration configuration = new Configuration();
    Path out = new Path(args[1]);
    try {
        //This is the first job to get the number of providers per state
        Job numberOfProvidersPerStateJob = Job.getInstance(configuration, "Total number of Providers per state");
        //Set the Jar file class, mapper and reducer class
        numberOfProvidersPerStateJob.setJarByClass(ProviderCount.class);
        numberOfProvidersPerStateJob.setMapperClass(MapForProviderCount.class);
        numberOfProvidersPerStateJob.setReducerClass(ReduceForProviderCount.class);

        numberOfProvidersPerStateJob.setOutputKeyClass(Text.class);
        numberOfProvidersPerStateJob.setOutputValueClass(IntWritable.class);
        //Provide the input and output argument this will be needed when running the jar file in hadoop
        FileInputFormat.addInputPath(numberOfProvidersPerStateJob, new Path(args[0]));
        FileOutputFormat.setOutputPath(numberOfProvidersPerStateJob, new Path(out,"out1"));
        if (!numberOfProvidersPerStateJob.waitForCompletion(true)) {
            System.exit(1);
        }
        //Job 2 for getting the state with maximum provider
        Job maxJobProviderState = Job.getInstance(configuration, "State With Max Job providers");
        //Set the Jar file class, mapper and reducer class
        maxJobProviderState.setJarByClass(ProviderCount.class);
        maxJobProviderState.setMapperClass(MapForMaxProvider.class);
        maxJobProviderState.setReducerClass(ReducerForMaxProvider.class);

        maxJobProviderState.setOutputKeyClass(IntWritable.class);
        maxJobProviderState.setOutputValueClass(Text.class);
        //Provide the input and output argument this will be needed when running the jar file in hadoop
        FileInputFormat.addInputPath(maxJobProviderState, new Path(out,"out1"));
        FileOutputFormat.setOutputPath(maxJobProviderState, new Path(out,"out2"));
        //Exit when results are ready
        System.exit(maxJobProviderState.waitForCompletion(true)?0:1);


    } catch (IOException | InterruptedException | ClassNotFoundException e) {
        e.printStackTrace();
    }
}

The problem is that every time I run it, the final output comes from the second mapper class rather than from the reducer class. It is as if my second reducer class is being ignored.

1 Answer:

Answer 0 (score: 0)

You can implement ChainMappers using (org.apache.hadoop.mapreduce.lib.chain.ChainMapper) and ChainReducers using (org.apache.hadoop.mapreduce.lib.chain.ChainReducer), which will solve your problem.
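Below is a minimal sketch of what that chaining could look like. It reuses the mapper/reducer class names from the question and assumes their key/value types match the comments; in particular, MapForMaxProvider would have to be adapted to consume the reducer's (Text, IntWritable) output directly instead of reading lines from a file, so treat this as an illustration of the ChainMapper/ChainReducer API rather than a drop-in replacement.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
import org.apache.hadoop.mapreduce.lib.chain.ChainReducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ProviderCountChained {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        Job job = Job.getInstance(configuration, "Providers per state (chained)");
        job.setJarByClass(ProviderCountChained.class);

        // First mapper in the chain: reads the raw input splits and emits
        // (state, 1). Input types assume the default TextInputFormat.
        ChainMapper.addMapper(job, MapForProviderCount.class,
                LongWritable.class, Text.class,
                Text.class, IntWritable.class,
                new Configuration(false));

        // The single reducer of the chain: sums the provider counts per state.
        ChainReducer.setReducer(job, ReduceForProviderCount.class,
                Text.class, IntWritable.class,
                Text.class, IntWritable.class,
                new Configuration(false));

        // A mapper that runs after the reducer, inside the same reduce task,
        // e.g. to re-key the totals so the state with the most providers stands out.
        ChainReducer.addMapper(job, MapForMaxProvider.class,
                Text.class, IntWritable.class,
                IntWritable.class, Text.class,
                new Configuration(false));

        // Output types of the last element in the chain.
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Keep in mind that a chained job follows the pattern [MAP+ / REDUCE MAP*]: any number of mappers, at most one reducer, then optional mappers after it. If a true second reduce step (ReducerForMaxProvider) is still required, that part would remain a second job reading the first job's output, as in the original code.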