Apache Spark MLlib: loading a logistic regression model from an HTTP source

Time: 2018-11-23 22:04:52

Tags: apache-spark model spark-streaming apache-spark-mllib

I have a model on an external HTTP source that I want to load into my Spark Streaming application to predict on incoming data. Since the data comes from different producers, I have to load a separate model depending on the data.

Spark runs on a DC/OS Mesos cluster, and the model is a folder containing data, metadata, and Parquet files. It cannot be loaded directly from HTTP; the loader requires a "file:..." path.
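Since the loader wants a `file:` path, one way to avoid ambiguity about where a relative path like `./testModel` resolves (in the sandbox it is relative to the driver's working directory) is to build an explicit absolute `file:` URI. A minimal sketch, assuming a POSIX filesystem and the `./testModel` folder used below:

```scala
import java.io.File

// Resolve the relative model folder against the process working
// directory and turn it into an explicit file: URI, so Spark's
// Hadoop input layer does not have to guess the base directory.
val modelDir = new File("./testModel").getCanonicalPath
val modelUri = "file://" + modelDir
println(modelUri)
```

The resulting `modelUri` can then be passed to `load` instead of the bare relative path.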

I tried downloading the files into a "./testmodel/" path; they do land in the sandbox, but the model does not load — apparently because of a wrong path — and I get the following exception: Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/mnt/mesos/sandbox/testmodel/metadata

import java.io.File
import java.net.URL
import scala.sys.process._

// Pipe the HTTP response body straight into a local file.
def fileDownloader(url: String, filename: String) = {
  new URL(url) #> new File(filename) !!
}
...

val modelFolder: File = new File("./testModel");
modelFolder.mkdir();
val modelDataFolder: File = new File("./testModel/data")
modelDataFolder.mkdir();
val modelMetaDataFolder: File = new File("./testModel/metadata");
modelMetaDataFolder.mkdir();
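As a side note, the three `mkdir()` calls above could be collapsed: `java.io.File.mkdirs()` creates a directory together with any missing parents, so each subfolder can be set up in one pass. A sketch over the same layout:

```scala
import java.io.File

// mkdirs() creates the directory and any missing parent directories,
// so ./testModel itself does not need a separate mkdir() call.
for (sub <- Seq("data", "metadata")) {
  new File(s"./testModel/$sub").mkdirs()
}
```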


fileDownloader("http://extern-host.com/sparkmodel/testmodel/data/._SUCCESS.crc", "./testModel/data/._SUCCESS.crc");
fileDownloader("http://extern-host.com/sparkmodel/testmodel/data/.part-00000-beca47f5-4fa8-4af8-ba76-21be4e0c4763-c000.snappy.parquet.crc", "./testModel/data/.part-00000-beca47f5-4fa8-4af8-ba76-21be4e0c4763-c000.snappy.parquet.crc");
fileDownloader("http://extern-host.com/sparkmodel/testmodel/data/_SUCCESS", "./testModel/data/_SUCCESS");
fileDownloader("http://extern-host.com/sparkmodel/testmodel/data/part-00000-beca47f5-4fa8-4af8-ba76-21be4e0c4763-c000.snappy.parquet", "./testModel/data/part-00000-beca47f5-4fa8-4af8-ba76-21be4e0c4763-c000.snappy.parquet");

fileDownloader("http://extern-host.com/sparkmodel/testmodel/metadata/._SUCCESS.crc", "./testModel/metadata/._SUCCESS.crc");
fileDownloader("http://extern-host.com/sparkmodel/testmodel/metadata/.part-00000.crc", "./testModel/metadata/.part-00000.crc");
fileDownloader("http://extern-host.com/sparkmodel/testmodel/metadata/_SUCCESS", "./testModel/metadata/_SUCCESS");
fileDownloader("http://extern-host.com/sparkmodel/testmodel/metadata/part-00000", "./testModel/metadata/part-00000");

val model = LogisticRegressionModel.load(sc, "./testModel");
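One thing worth checking before the `load` call: the exception complains about `file:/mnt/mesos/sandbox/testmodel/metadata` (lower-case `testmodel`), while the code creates `./testModel`. A small hypothetical helper — `modelLayoutOk` is not part of any API, just a sketch — can verify that the `data/` and `metadata/` subfolders the loader expects actually exist at the resolved path:

```scala
import java.io.File

// Sanity-check the on-disk layout before handing the path to Spark:
// a saved MLlib model folder contains data/ and metadata/ subfolders.
def modelLayoutOk(dir: String): Boolean = {
  val root = new File(dir)
  new File(root, "data").isDirectory && new File(root, "metadata").isDirectory
}
```

If `modelLayoutOk("./testModel")` is false at the point of the `load` call, the problem is the path (working directory or casing), not the model files themselves.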

Thanks in advance for any advice.

0 Answers:

There are no answers yet.