
Yes. In Hadoop you can do this with the MultipleOutputFormat class by overriding its generateFileNameForKeyValue method.

Use the country name as the key and the record as the value; that should do exactly what you need.
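The routing idea can be sketched without Hadoop. The snippet below is only an illustration in plain Java: `CountryRouter`, `fileNameFor` and `route` are hypothetical names, and the in-memory map stands in for the files Hadoop would actually create. `fileNameFor` plays the role of `generateFileNameForKeyValue(key, value, name)`, where `name` is the default leaf file name such as `part-00000`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hadoop-free illustration of routing records to per-key output files.
// All names here are hypothetical; this is not the Hadoop API.
final class CountryRouter {

    // Mirrors the role of generateFileNameForKeyValue(key, value, name):
    // returns the relative path of the file the record should go to.
    static String fileNameFor(String country, String record, String name) {
        return country + "/" + name;
    }

    // Groups record values by the file they would be written to.
    // Each input entry is a two-element array: {country, record}.
    static Map<String, List<String>> route(List<String[]> records, String name) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        for (String[] kv : records) {
            String file = fileNameFor(kv[0], kv[1], name);
            files.computeIfAbsent(file, f -> new ArrayList<>()).add(kv[1]);
        }
        return files;
    }
}
```

With records keyed by "France" and "Japan" and the leaf name `part-00000`, the sketch produces one entry per country, e.g. `France/part-00000` and `Japan/part-00000`.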

If you are using the new API, take a look at the MultipleOutputs class instead. Its documentation includes an example.

Usage pattern for job submission:




    Job job = Job.getInstance();   // new Job() is deprecated in Hadoop 2.x and later

    FileInputFormat.addInputPath(job, inDir);
    FileOutputFormat.setOutputPath(job, outDir);

    job.setMapperClass(MOMap.class);
    job.setReducerClass(MOReduce.class);
    ...

    // Defines an additional text-based output named 'text' for the job
    MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class,
        LongWritable.class, Text.class);

    // Defines an additional sequence-file-based output named 'seq' for the job
    MultipleOutputs.addNamedOutput(job, "seq",
        SequenceFileOutputFormat.class,
        LongWritable.class, Text.class);
    ...

    job.waitForCompletion(true);
    ...

Usage in the Reducer:



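    public class MOReduce extends
        Reducer<WritableComparable, Writable, WritableComparable, Writable> {

      private MultipleOutputs<WritableComparable, Writable> mos;

      // Builds a per-record base output path from the key and value
      private <K, V> String generateFileName(K k, V v) {
        return k.toString() + "_" + v.toString();
      }

      @Override
      public void setup(Context context) {
        ...
        mos = new MultipleOutputs<>(context);
      }

      @Override
      public void reduce(WritableComparable key, Iterable<Writable> values,
                         Context context)
          throws IOException, InterruptedException {
        ...
        // Write to the named outputs; the optional last argument is a base output path
        mos.write("text", key, new Text("Hello"));
        mos.write("seq", new LongWritable(1), new Text("Bye"), "seq_a");
        mos.write("seq", new LongWritable(2), new Text("Chau"), "seq_b");
        // Write to a file whose name is derived from the key and value
        mos.write(key, new Text("value"), generateFileName(key, new Text("value")));
        ...
      }

      @Override
      public void cleanup(Context context) throws IOException, InterruptedException {
        mos.close();   // closes every output opened through mos
        ...
      }
    }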
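It may help to know what the resulting files are called. As far as I know, files written through MultipleOutputs follow the pattern `<base>-<m|r>-<partition>`, where the base is the named output or the base output path passed to `write`, `m` or `r` marks a map or reduce task, and the partition number is padded to five digits. The helper below is a plain-Java sketch of that convention only; `OutputNames` and `fileName` are hypothetical names, and an output format may append its own extension.

```java
import java.util.Locale;

// Sketch of the file naming convention for MultipleOutputs files.
// Hypothetical helper for illustration; not part of the Hadoop API.
final class OutputNames {

    // basePath:  named output or base output path passed to write(...)
    // mapTask:   true for map-side output, false for reduce-side output
    // partition: partition number of the task that wrote the file
    static String fileName(String basePath, boolean mapTask, int partition) {
        return String.format(Locale.ROOT, "%s-%s-%05d",
                basePath, mapTask ? "m" : "r", partition);
    }
}
```

Under this convention, the `"seq_a"` write in the reducer above would end up in a file such as `seq_a-r-00000` for the first reduce task.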

Is it possible to have multiple output files for a map-reduce job?

2 answers: