Why does MapR give me a NullPointerException when reading a file?

Date: 2014-11-16 05:30:50

Tags: java hdfs apache-spark mapr

I get the following exception when reading a file from a MapR directory:

java.lang.NullPointerException
at com.mapr.fs.MapRFsInStream.read(MapRFsInStream.java:150)
at java.io.DataInputStream.read(DataInputStream.java:83)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:205)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:169)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:203)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:43)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:184)
at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:167)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:37)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:90)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:90)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:37)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:240)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:149)
at org.apache.spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)

When I run this on a local Spark instance, I don't get the exception. My guess is that the file type is causing it. Any idea what is causing this NPE?
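(For context, the question does not include the code that triggers the trace, but the frames above, HadoopRDD, LineRecordReader, and Aggregator.combineValuesByKey inside PairRDDFunctions, are consistent with a text-file read followed by a key/value aggregation. A minimal sketch of such a job follows; the maprfs path and the word-count transformation are hypothetical placeholders, not the actual code.)

import org.apache.spark.{SparkConf, SparkContext}
// Needed on Spark 1.x for the PairRDDFunctions implicits (reduceByKey).
import org.apache.spark.SparkContext._

object MapRReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("mapr-read-sketch"))

    // Read a text file from MapR-FS; MapRFsInStream in the stack trace is
    // the MapR client stream backing this HadoopRDD. The path is a placeholder.
    val lines = sc.textFile("maprfs:///some/mapr/directory/input.txt")

    // A key/value aggregation matches the Aggregator.combineValuesByKey
    // frames in the trace (here a simple word count, purely illustrative).
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.collect().foreach(println)
    sc.stop()
  }
}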

1 answer:

Answer 0 (score: 0)

Could you provide some more context on what you are trying to run here? The versions of the components involved, etc.

The NPE above typically occurs when the FileSystem object is closed before the Map/Reduce job has finished reading its actual input data and writing its output. Spark may be attempting something similar.
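As an illustration of that pitfall: Hadoop's FileSystem.get() returns a JVM-wide cached instance, so closing it while something else in the same JVM (for example a Spark task in local mode) is still reading through it can surface as an NPE deep inside the client stream. The sketch below is an assumed reconstruction of that failure mode, not the asker's code, and the maprfs paths are hypothetical placeholders.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.{SparkConf, SparkContext}

object FsCloseSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("fs-close-sketch"))

    // FileSystem.get returns a cached instance shared across the whole JVM,
    // including by the Hadoop input code behind sc.textFile.
    val fs = FileSystem.get(new Configuration())
    fs.exists(new Path("maprfs:///some/mapr/directory"))  // placeholder path

    // Closing the cached instance closes it for every user in this JVM;
    // a task still reading through it can then fail with an NPE in the client.
    // fs.close()  // the problematic call; avoid it, or use
    //             // FileSystem.newInstance(conf) and close only that private copy

    val lines = sc.textFile("maprfs:///some/mapr/directory/input.txt")  // placeholder path
    println(lines.count())
    sc.stop()
  }
}

Whether Spark itself, a third-party input format, or user code closes the shared instance, the symptom is the same, which is why the version information asked for above matters.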