I am unable to load a spark-nlp model on a Spark cluster. When I try to load the model files, this is what I get:
Exception in thread "main" org.apache.spark.SparkException: addFile does not support local directories when not running local mode.
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1537)
    at com.johnsnowlabs.nlp.embeddings.SparkWordEmbeddings$.copyIndexToCluster(SparkWordEmbeddings.scala:86)
    at com.johnsnowlabs.nlp.embeddings.SparkWordEmbeddings$.apply(SparkWordEmbeddings.scala:111)
    at com.johnsnowlabs.nlp.HasWordEmbeddings$class.deserializeEmbeddings(HasWordEmbeddings.scala:57)
    at com.johnsnowlabs.nlp.annotators.ner.crf.NerCrfModel.deserializeEmbeddings(NerCrfModel.scala:19)
    at com.johnsnowlabs.nlp.embeddings.EmbeddingsReadable$class.readEmbeddings(EmbeddingsReadable.scala:8)
    at com.johnsnowlabs.nlp.annotators.ner.crf.NerCrfModel$.readEmbeddings(NerCrfModel.scala:84)
    at com.johnsnowlabs.nlp.embeddings.EmbeddingsReadable$$anonfun$1.apply(EmbeddingsReadable.scala:11)
    at com.johnsnowlabs.nlp.embeddings.EmbeddingsReadable$$anonfun$1.apply(EmbeddingsReadable.scala:11)
    at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$com$johnsnowlabs$nlp$ParamsAndFeaturesReadable$$onRead$1.apply(ParamsAndFeaturesReadable.scala:31)
This is how I load the model and pipeline folders in my code:
import org.apache.spark.ml.{Pipeline, PipelineModel}

val pipeline = Pipeline.read.load("/opt/pipeline/")
val model = PipelineModel.read.load("/opt/model/")
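I have verified that the model files exist at this path on all worker nodes. The first line (Pipeline.read.load) works fine. The error occurs on the second line (PipelineModel.read.load), when the model files are loaded.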
Thanks.