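String join in the mapper class of a MapReduce program causes a failure

Time: 2016-06-06 10:05:20

Tags: mapreduce yarn hadoop2

In my mapper class, I want to do a small manipulation on each string read from the file (one line at a time) and then send it to the reducer to get a count per string. The manipulation is to replace an empty field with "0". (Currently the replace-and-join part makes my Hadoop job fail.)

Here is my code: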

import java.io.BufferedReader;
import java.io.IOException;
.....

public class PartNumberMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static Text partString = new Text("");

        private final static IntWritable count = new IntWritable(1);

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {

                String line = value.toString();

                // Read line by line to bufferreader and output the (line,count) pair
              BufferedReader bufReader = new BufferedReader(new StringReader(line));
              String l=null;
              while( (l=bufReader.readLine()) != null )
              {
                 /**** This part is the problem ****/
                  String a[]=l.split(",");
                  if(a[1]==""){  // if a[1] i.e. second string is "" then set it to "0"
                          a[1]="0";
                          l = StringUtils.join(",", a); // join the string array to form a string
                  }
                 /**** problematic part ends ****/

                        partString.set(l);
                        output.collect(partString, count);
              }

        }    
}

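After running this, the mapper fails and does not report any error. [The code runs on YARN.] I'm not sure what I'm doing wrong; the same code works without the string-join part.

Can anyone explain what is wrong with the string replace/concat? Is there a better way to do this?

1 Answer:

Answer 0: (score: 1)

Here is a modified version of your Mapper class, with just a few changes: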

  1. Remove the BufferedReader; it seems redundant and is never closed
  2. String equality should be tested with .equals(), not ==
  3. Declare the string array as String[] a rather than String a[]

This results in the following code:

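    public class PartNumberMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    
            // Reused for every record to avoid allocating a new Text per call
            private Text partString = new Text();
            private final static IntWritable count = new IntWritable(1);
    
            public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
    
                    // Each call to map() already receives a single line of input
                    String line = value.toString();
                    String[] a = line.split(",");
    
                    // Compare string contents with equals(); == compares references
                    if (a[1].equals("")) {
                        a[1] = "0";
                        // Assumes org.apache.hadoop.util.StringUtils (separator first);
                        // the question's import list is elided
                        line = StringUtils.join(",", a);
                    }
    
                    // Emit (line, 1) so the reducer can sum the counts per string
                    partString.set(line);
                    output.collect(partString, count);
            }    
    }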
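
The field-replacement logic can also be tried outside Hadoop. The following is only a minimal sketch, not part of the original answer: the class `FieldNormalizer` and the method `normalize` are hypothetical names, and it assumes Java 8 or later for `String.join`. It also covers one case the code above does not handle: `String.split(",")` drops trailing empty strings, so a line such as `a,` yields an array of length 1 and `a[1]` would throw an `ArrayIndexOutOfBoundsException`. Passing a limit of `-1` keeps those trailing fields.

```java
public class FieldNormalizer {

    // Hypothetical helper (not from the original post): if the second
    // comma-separated field is empty, replace it with "0" and re-join.
    static String normalize(String line) {
        // A limit of -1 keeps trailing empty fields, so "a," -> ["a", ""]
        String[] fields = line.split(",", -1);

        // Guard against lines with fewer than two fields, and compare
        // contents (isEmpty/equals) rather than references (==)
        if (fields.length > 1 && fields[1].isEmpty()) {
            fields[1] = "0";
            return String.join(",", fields); // String.join requires Java 8+
        }
        return line;
    }

    public static void main(String[] args) {
        System.out.println(normalize("a,,c"));  // a,0,c
        System.out.println(normalize("a,"));    // a,0
        System.out.println(normalize("a,b,c")); // a,b,c
        System.out.println(normalize("a"));     // a

        // Without the limit argument the trailing empty field is dropped
        System.out.println("a,".split(",").length); // 1
    }
}
```

Inside the mapper, the same guard could be applied before accessing `a[1]`. Whether a short or malformed line should be skipped, counted, or passed through unchanged depends on the data, so that choice is left open here.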