Can't serialize class (MongoDB, Spark)

Time: 2015-07-23 08:33:33

Tags: java mongodb serialization apache-spark

My application has a model class:

import java.io.Serializable;
import java.util.Date;

public class Observation implements Serializable{

private static final long serialVersionUID = 1L;
...
}

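I made this class serializable by implementing the Serializable interface. The class is the model for a MongoDB collection, and mapping MongoDB records to Observation objects works fine.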

When I run the application, a Spark job performs a map-reduce. After the map-reduce step I get the exception below. Here is the stack trace:

ERROR Executor: Exception in task ID 133
java.lang.IllegalArgumentException: can't serialize class com.mongodb.spark.demo.Observation
at org.bson.BasicBSONEncoder._putObjectField(BasicBSONEncoder.java:284)
at org.bson.BasicBSONEncoder.putObject(BasicBSONEncoder.java:185)
at org.bson.BasicBSONEncoder.putObject(BasicBSONEncoder.java:131)
at com.mongodb.DefaultDBEncoder.writeObject(DefaultDBEncoder.java:33)
at com.mongodb.BSONBinaryWriter.encodeDocument(BSONBinaryWriter.java:339)
at com.mongodb.InsertCommandMessage.writeTheWrites(InsertCommandMessage.java:45)
at com.mongodb.InsertCommandMessage.writeTheWrites(InsertCommandMessage.java:23)
at com.mongodb.BaseWriteCommandMessage.encodeMessageBody(BaseWriteCommandMessage.java:69)
at com.mongodb.BaseWriteCommandMessage.encodeMessageBody(BaseWriteCommandMessage.java:23)
at com.mongodb.RequestMessage.encode(RequestMessage.java:66)
at com.mongodb.BaseWriteCommandMessage.encode(BaseWriteCommandMessage.java:53)
at com.mongodb.DBCollectionImpl.sendWriteCommandMessage(DBCollectionImpl.java:473)
at com.mongodb.DBCollectionImpl.writeWithCommandProtocol(DBCollectionImpl.java:427)
at com.mongodb.DBCollectionImpl.insertWithCommandProtocol(DBCollectionImpl.java:387)
at com.mongodb.DBCollectionImpl.insert(DBCollectionImpl.java:186)
at com.mongodb.DBCollectionImpl.insert(DBCollectionImpl.java:165)
at com.mongodb.DBCollection.insert(DBCollection.java:161)
at com.mongodb.DBCollection.insert(DBCollection.java:107)
at com.mongodb.DBCollection.save(DBCollection.java:966)
at com.mongodb.DBCollection.save(DBCollection.java:934)
at com.mongodb.hadoop.output.MongoRecordWriter.write(MongoRecordWriter.java:93)
at org.apache.spark.rdd.PairRDDFunctions.org$apache$spark$rdd$PairRDDFunctions$$writeShard$1(PairRDDFunctions.scala:716)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:730)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:730)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Why am I getting this exception? Can anyone help me find a solution?

2 answers:

Answer 0 (score: 0)

Perhaps the problem is that the class has no default constructor:

public Observation () {
}

Answer 1 (score: 0)

The problem was solved when I made my class extend BasicDBObject.

The latest, working version of the class is given below:

public class Observation extends BasicDBObject implements Serializable{

private static final long serialVersionUID = 1L;
...
}
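
A note on why this probably works: the stack trace fails inside org.bson.BasicBSONEncoder, reached from MongoRecordWriter.write via DBCollection.save. That is BSON encoding, not Java serialization, so implementing Serializable does not affect it. BasicDBObject is itself a map of field names to values, which the encoder knows how to write. If you would rather not have the model extend a driver class, an alternative is to flatten the object into a plain map of BSON-friendly values (String, Date, Double and so on) before handing it to the writer. The sketch below uses only the JDK. The field names sensorId, takenAt and value and the method toDocument are hypothetical, since the question elides the real fields, and it assumes the legacy encoder accepts java.util.Map values.

```java
import java.io.Serializable;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

public class ObservationDocumentSketch {

    // Hypothetical stand-in for the Observation model; the real fields are not shown in the question.
    static class Observation implements Serializable {
        private static final long serialVersionUID = 1L;

        final String sensorId; // hypothetical field
        final Date takenAt;    // hypothetical field
        final double value;    // hypothetical field

        Observation(String sensorId, Date takenAt, double value) {
            this.sensorId = sensorId;
            this.takenAt = takenAt;
            this.value = value;
        }

        // Flatten the object into a map holding only simple value types
        // (String, Date, Double), keeping the fields in insertion order.
        Map<String, Object> toDocument() {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("sensorId", sensorId);
            doc.put("takenAt", takenAt);
            doc.put("value", value);
            return doc;
        }
    }

    public static void main(String[] args) {
        Observation o = new Observation("s-1", new Date(0L), 21.5);
        // Print the field names of the flattened document.
        System.out.println(o.toDocument().keySet());
    }
}
```

In the Spark job, the idea would be to map each Observation to such a document (or to a BasicDBObject built from it) just before the save step, so the model class stays free of driver types.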