Question

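My Hadoop program reads thousands of lines from (.log) files and parses them. I then write the results to a file with output.collect(). However, every reducer output is followed by a newline. How can I write everything on the same line, so that my output file contains only a single line?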

Reducer Class

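    import java.io.IOException;
    import java.util.Iterator;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;

    public class Reduce extends MapReduceBase
            implements Reducer<Text, Text, Text, Text> {

        public void reduce(Text key, Iterator<Text> values,
                OutputCollector<Text, Text> output,
                Reporter reporter) throws IOException {
            // Emit each distinct key once, with an empty value.
            Text t2 = new Text("");
            output.collect(key, t2);
        }
    }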
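For context: with the old org.apache.hadoop.mapred API the default output format is TextOutputFormat, whose record writer emits the key, a separator (a tab by default), the value, and then a newline for every record. The newline is not configurable there, so getting everything on one line would most likely need a custom OutputFormat whose RecordWriter leaves it out, together with a single reduce task (JobConf.setNumReduceTasks(1)) so that only one part file is produced. The plain-Java sketch below only illustrates the write logic such a writer could use. SingleLineWriter, joinOnOneLine, and the space separator are hypothetical names and choices, not Hadoop classes or anything taken from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical stand-in for a custom record writer; not a Hadoop class.
class SingleLineWriter {
    private final DataOutputStream out;
    private final byte[] separator;
    private boolean first = true;

    SingleLineWriter(DataOutputStream out, String separator) {
        this.out = out;
        this.separator = separator.getBytes(StandardCharsets.UTF_8);
    }

    // Writes one record. A separator precedes every record except the
    // first, and no newline is ever written.
    void write(String key) throws IOException {
        if (!first) {
            out.write(separator);
        }
        out.write(key.getBytes(StandardCharsets.UTF_8));
        first = false;
    }

    void close() throws IOException {
        out.close();
    }

    // Convenience helper for trying the writer in memory.
    static String joinOnOneLine(List<String> keys, String separator) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            SingleLineWriter writer =
                    new SingleLineWriter(new DataOutputStream(buffer), separator);
            for (String key : keys) {
                writer.write(key);
            }
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In a real job the same write logic would presumably sit inside the write(key, value) method of a RecordWriter returned by the custom OutputFormat.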

Mapper Class

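    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> output,
                    Reporter reporter) throws IOException {

        // Assumption: fs is a FileSystem handle obtained elsewhere,
        // e.g. via FileSystem.get(conf); the post does not show it.
        Path pt = new Path("xxxx"); // Location of file in HDFS
        BufferedReader br = new BufferedReader(
                new InputStreamReader(fs.open(pt)));

        String line = value.toString();

        String bePublished = "";
        String patternString = "xxxx";
        Pattern pattern = Pattern.compile(patternString);
        Matcher matcher = pattern.matcher(line);

        for (int u = 0; u < 48; u++) {
            // Here I update my bePublished string
        }

        // The parsed line becomes the key; the value is left empty.
        Text t2 = new Text("");
        output.collect(new Text(bePublished), t2);
    }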
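The body of the loop is not shown in the question. Purely as an illustration of how a Matcher's capture groups might be collected into a string like bePublished, here is a minimal sketch. The pattern (three whitespace-separated fields), the comma delimiter, and the names LineFields and extract are all assumptions made for the example; the real pattern is elided in the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineFields {
    // Hypothetical pattern: three whitespace-separated fields.
    // The actual pattern in the question is elided ("xxxx").
    private static final Pattern PATTERN =
            Pattern.compile("(\\S+)\\s+(\\S+)\\s+(\\S+)");

    // Returns the captured fields joined by commas, or an empty
    // string when the line does not match the pattern.
    static String extract(String line) {
        Matcher matcher = PATTERN.matcher(line);
        if (!matcher.find()) {
            return "";
        }
        StringBuilder fields = new StringBuilder();
        for (int g = 1; g <= matcher.groupCount(); g++) {
            if (g > 1) {
                fields.append(',');
            }
            fields.append(matcher.group(g));
        }
        return fields.toString();
    }
}
```

Calling find() before reading any group matters here, because group() throws an IllegalStateException when no match has been attempted. A StringBuilder also avoids creating a new String on every iteration of the loop.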

Hadoop - Remove newlines from reducer output

0 answers: