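I am trying to split the input value into 3 parts, assign each part to its own string, and perform some operations on them. However, I get an ArrayIndexOutOfBoundsException and I cannot figure out why.
Mapper: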
public void map(Object Key, Text value, Context context) throws IOException, InterruptedException {
    String tweet = value.toString();
    String date = null;
    String parts[] = tweet.split("\\t");
    String sentence = parts[0].toString();
    for (int i = 0; i < parts.length; i++) {
        System.out.println("part " + i + parts[i]);
    }
    if (parts.length > 0) {
        date = parts[1];
    }
    word.set(date);
    context.write(word, one);
}
Stack trace:
2015-07-31 16:50:50,288 INFO [Thread-11] mapred.LocalJobRunner (LocalJobRunner.java:run(397)) - Map task executor complete.
2015-07-31 16:50:50,295 WARN [Thread-11] mapred.LocalJobRunner (LocalJobRunner.java:run(482)) - job_local467783972_0001
java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 1
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
*at sw$TweetMapper.map(sw.java:103)* --> points to date=parts[1]
at sw$TweetMapper.map(sw.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
The length of parts[] is greater than 0, but the assignment still throws the error. Any help is much appreciated.
Answer 0 (score: 2)
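I think you can use "\t" as the tab literal in the split rather than "\\t". In Java both spellings end up matching a tab character, because the regex engine reads the escape \t as a tab, so this change by itself will not remove the exception; it is just the simpler form: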
String parts[] = tweet.split("\t");
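To make the behavior of the two spellings concrete, here is a minimal sketch in plain Java (no Hadoop needed). The class name TabSplitDemo, its helper methods, and the sample strings are hypothetical and exist only for illustration. It suggests that the separator spelling is not what decides whether parts[1] exists; whether the line contains a tab is.

```java
import java.util.Arrays;

class TabSplitDemo {
    // Splits on the literal tab character embedded in the Java string.
    static String[] splitLiteral(String line) {
        return line.split("\t");
    }

    // Splits on the regex escape \t, which java.util.regex also treats as a tab.
    static String[] splitEscaped(String line) {
        return line.split("\\t");
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from the asker's data.
        String withTab = "some tweet text\t2015-07-31";
        String withoutTab = "a line with no tab";

        // Both spellings produce the same two fields.
        System.out.println(Arrays.toString(splitLiteral(withTab)));
        System.out.println(Arrays.toString(splitEscaped(withTab)));

        // With no tab in the line, the array has a single element,
        // so index 1 would be out of bounds.
        System.out.println(splitLiteral(withoutTab).length);
    }
}
```

If some input lines have no tab at all, either spelling returns a one-element array for them, which matches the index 1 reported in the stack trace.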
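The check parts.length > 0 does not adequately protect you from an out-of-bounds exception. As you saw in your case, a length greater than zero does not mean there is an element at parts[1]; you should check that the length is greater than 1: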
if (parts.length > 1) {
    date = parts[1];
}
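As a rough, self-contained illustration of the length check, the sketch below pulls the guard out into a plain Java helper so it can be tried without a Hadoop job. The class DateFieldExtractor, the method extractDate, and the sample inputs are assumptions made up for this example and are not part of the original mapper.

```java
class DateFieldExtractor {
    // Returns the second tab-separated field, or null when the line
    // has fewer than two fields (hypothetical helper for illustration).
    static String extractDate(String line) {
        String[] parts = line.split("\t");
        if (parts.length > 1) {
            return parts[1];
        }
        return null;
    }

    public static void main(String[] args) {
        // A well-formed line: text, then a tab, then the date.
        System.out.println(extractDate("some tweet text\t2015-07-31")); // prints 2015-07-31

        // A line without any tab has only one field.
        System.out.println(extractDate("malformed line")); // prints null

        // A trailing tab also yields one field, because split drops
        // trailing empty strings.
        System.out.println(extractDate("text then tab\t")); // prints null
    }
}
```

In the mapper itself, one option would be to return early from map when no date is found, so that word.set(date) is never called with a null value; whether skipping such records is acceptable depends on your data.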