I am trying to split a string in MapReduce2 (YARN) on the Hortonworks Sandbox. Accessing val[1] throws an ArrayIndexOutOfBoundsException; the job works fine when I don't split the input.
Mapper:
public class MapperClass extends Mapper<Object, Text, Text, Text> {
    private Text airline_id;
    private Text name;
    private Text country;
    private Text value1;

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String s = value.toString();
        if (s.length() > 1) {
            String val[] = s.split(",");
            context.write(new Text("blah"), new Text(val[1]));
        }
    }
}
Reducer:
public class ReducerClass extends Reducer<Text, Text, Text, Text> {
    private Text result = new Text();

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String airports = "";
        if (key.equals("India")) {
            for (Text val : values) {
                airports += "\t" + val.toString();
            }
            result.set(airports);
            context.write(key, result);
        }
    }
}
MainClass:
public class MainClass {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        @SuppressWarnings("deprecation")
        Job job = new Job(conf, "Flights MR");
        job.setJarByClass(MainClass.class);
        job.setMapperClass(MapperClass.class);
        job.setReducerClass(ReducerClass.class);
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Can you help?
Update
Figured it out: it was not converting the Text to a String.
Answer 0 (score: 0)
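If the string you are splitting contains no comma, the resulting String[] has length 1 and the entire string ends up in val[0].
Currently, you only check that the string is longer than one character:
if (s.length() > 1)
But you never check whether the split actually produces an array with more than one element; you simply assume it does: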
context.write(new Text("blah"), new Text(val[1]));
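If nothing was split off, this throws an out-of-bounds error. One possible fix is to check the length of the array that split returns, instead of only checking the length of the string. Checking for a comma with s.indexOf(',') > -1 is not quite enough on its own, because split drops trailing empty strings, so a line such as "abc," still yields a one-element array:
String s = value.toString();
String[] val = s.split(",");
if (val.length > 1) {
    context.write(new Text("blah"), new Text(val[1]));
}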
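To see how split behaves on the kinds of lines that trigger the exception, here is a minimal standalone sketch that runs without Hadoop. The class SplitDemo, the helper secondField, and the sample lines are hypothetical names and data made up for illustration; they are not part of the original job.

```java
import java.util.Optional;

public class SplitDemo {
    // Hypothetical helper: returns the second comma-separated field of a line,
    // or Optional.empty() when split yields fewer than two elements.
    public static Optional<String> secondField(String line) {
        String[] fields = line.split(",");
        return fields.length > 1 ? Optional.of(fields[1]) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up sample lines: a normal record, a line without commas,
        // a line with a trailing comma, and a line that is only a comma.
        String[] samples = {"1,Air India,India", "no commas here", "abc,", ","};
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + secondField(sample));
        }
    }
}
```

Only the first sample yields a second field. In a mapper, the same length check would decide whether to call context.write or skip the line.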