Spark ML model saves but cannot be loaded in cluster mode

Date: 2019-11-14 22:34:27

Tags: apache-spark apache-spark-ml spark-java

I trained an MLP model with Spark ML locally and saved it to the local file system with model.save(). The save itself went smoothly: the resulting directory contains separate data and metadata subdirectories holding Parquet files.
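For reference, the save step looked roughly like the sketch below. The dataset path, layer sizes, and output path are placeholders, not the actual values used:

```java
import org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel;
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class TrainAndSave {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("mlp-train")
                .master("local[*]")   // trained locally, as described above
                .getOrCreate();

        // Hypothetical training data in libsvm format
        Dataset<Row> train = spark.read().format("libsvm").load("data/train.libsvm");

        // Placeholder layer sizes: input features, two hidden layers, output classes
        int[] layers = new int[] {784, 128, 64, 10};
        MultilayerPerceptronClassifier trainer = new MultilayerPerceptronClassifier()
                .setLayers(layers)
                .setMaxIter(100)
                .setSeed(1234L);

        MultilayerPerceptronClassificationModel model = trainer.fit(train);

        // Writes a directory with metadata/ (JSON) and data/ (Parquet) subdirectories
        model.save("file:///tmp/mlp-model");

        spark.stop();
    }
}
```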

But when I export the saved model to our production environment and try to load it in cluster mode, I get IllegalStateException: unread block data:

```
detail=IllegalStateException: unread block data; limitsExceeded=false
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.1.0.118, executor 1): java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2783)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1605)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:370)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1887) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1875) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1874) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) ~[scala-library-2.11.12.jar:?]
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) ~[scala-library-2.11.12.jar:?]
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1874) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at scala.Option.foreach(Option.scala:257) ~[scala-library-2.11.12.jar:?]
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2108) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2057) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2046) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1364) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:363) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDD.take(RDD.scala:1337) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1378) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:363) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.rdd.RDD.first(RDD.scala:1377) ~[spark-core_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.ml.util.DefaultParamsReader$.loadMetadata(ReadWrite.scala:615) ~[spark-mllib_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel$MultilayerPerceptronClassificationModelReader.load(MultilayerPerceptronClassifier.scala:363) ~[spark-mllib_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel$MultilayerPerceptronClassificationModelReader.load(MultilayerPerceptronClassifier.scala:356) ~[spark-mllib_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.ml.util.MLReadable$class.load(ReadWrite.scala:380) ~[spark-mllib_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel$.load(MultilayerPerceptronClassifier.scala:337) ~[spark-mllib_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
    at org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel.load(MultilayerPerceptronClassifier.scala) ~[spark-mllib_2.11-2.4.0-sfdc-0.5.jar:2.4.0-sfdc-0.5]
```

I have searched through many other threads, and they mostly suggest a Spark or Java version mismatch. But I suspect this has more to do with how Spark ML saves the model in standalone mode as opposed to cluster mode. Any ideas?
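For context, the load call on the cluster is essentially the following. The model path is a placeholder; note that in cluster mode the path has to be visible to every executor, which is one thing worth ruling out:

```java
import org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel;
import org.apache.spark.sql.SparkSession;

public class LoadModel {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("mlp-load")
                .getOrCreate();

        // In cluster mode the model directory must live on a filesystem shared
        // by the driver and all executors (e.g. HDFS), not just the driver's disk.
        MultilayerPerceptronClassificationModel model =
                MultilayerPerceptronClassificationModel.load("hdfs:///models/mlp-model");

        spark.stop();
    }
}
```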

0 Answers:

No answers yet.