I have split my input data into train_df, test_df and val_df. I trained a model on train_df and now want to save it and load it back.
My code:
from pyspark.ml.classification import LogisticRegression, LogisticRegressionModel
from pyspark.ml.evaluation import BinaryClassificationEvaluator

lr = LogisticRegression(maxIter=100)
lrModel = lr.fit(train_df)
predictions = lrModel.transform(val_df)
evaluator = BinaryClassificationEvaluator(rawPredictionCol="rawPrediction")
print("Prediction : \n")
print(evaluator.evaluate(predictions))
accuracy = predictions.filter(predictions.label == predictions.prediction).count() / float(val_df.count())
print("Accuracy : \n")
print(accuracy)
lrModel.write().save("/home/vijay18/spark-2.1.0-bin-hadoop2.7/python/lrModel")
model = LogisticRegressionModel()
model.load("/home/vijay18/spark-2.1.0-bin-hadoop2.7/python/lrModel")
This is the error I get in the terminal. The first three lines of the error come from saving the model; the rest come from loading it.
Error:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
18/07/17 20:04:01 WARN ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
Answer 0 (score: 1)
load cannot be called on an instance; it is a classmethod on the model class. It should be:
from pyspark.ml.classification import LogisticRegressionModel
LogisticRegressionModel.load(path)