pyspark - LinearRegression.load() throws NoSuchMethodException

Time: 2018-02-08 19:49:07

Tags: python apache-spark machine-learning pyspark

When I try to load a saved linear regression model, I get the following error:

Traceback (most recent call last):
  File "server.py", line 5, in <module>
    linReg = Model()
  File "/home/pyspark/Desktop/building_py_rec/lin_reg/ml_algo/model.py", line 23, in __init__
    self.model = LinearRegression.load('model_lin_reg')
  File "/home/pyspark/spark-2.1.0-bin-hadoop2.7/python/pyspark/ml/util.py", line 252, in load
    return cls.read().load(path)
  File "/home/pyspark/spark-2.1.0-bin-hadoop2.7/python/pyspark/ml/util.py", line 193, in load
    java_obj = self._jread.load(path)
  File "/home/pyspark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/home/pyspark/spark-2.1.0-bin-hadoop2.7/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/home/pyspark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o24.load.
: java.lang.NoSuchMethodException: org.apache.spark.ml.regression.LinearRegressionModel.<init>(java.lang.String)
    at java.lang.Class.getConstructor0(Class.java:3082)
    at java.lang.Class.getConstructor(Class.java:1825)
    at org.apache.spark.ml.util.DefaultParamsReader.load(ReadWrite.scala:325)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:280)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)

The relevant code is as follows:

import findspark
findspark.init()

import pyspark
from pyspark.sql import SparkSession
from pyspark.ml.regression import LinearRegression
from pyspark.ml.feature import VectorAssembler
from pyspark.sql.types import StructType, StructField, IntegerType, DoubleType

X_Cols = ["Freq_Hz", "AoA_Deg", "Chord_m", "V_inf_mps", "displ_thick_m"]


class Model:

    spark = None
    model = None
    airfoil_assembler = None

    def __init__(self):
        print('[Model] Creating spark session...')
        self.spark = SparkSession.builder.appName('lin_reg_reader').getOrCreate()
        print('[Model] Loading model...')
        self.model = LinearRegression.load('model_lin_reg')
        print('[Model] Loading complete...')
        self.airfoil_assembler = VectorAssembler(inputCols=X_Cols, outputCol='features')

    def _getSchema(self):
        # StructType expects an ordered list of fields; a set does not preserve column order
        schema = StructType([
            StructField("Freq_Hz", IntegerType(), False),
            StructField("AoA_Deg", IntegerType(), False),
            StructField("Chord_m", DoubleType(), False),
            StructField("V_inf_mps", DoubleType(), False),
            StructField("displ_thick_m", DoubleType(), False),
        ])
        return schema

    def _prepare_df(self):
        # NOTE: `df` is never defined in this snippet; the method is unused as posted
        schema = self._getSchema()
        return df

    def assemble(self, tup):
        schema = self._getSchema()
        df = self.spark.createDataFrame(tup, schema)
        assembled_vector = self.airfoil_assembler.transform(df)
        return assembled_vector

    def predict(self, airfoil):
        assembled_vector = self.assemble(tup=airfoil)
        # A DataFrame is scored with transform(), which adds a 'prediction' column
        return self.model.transform(assembled_vector)

Notes

  • Spark version: 2.1.0
  • According to this link, a load function should be available.

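1 answer:

Answer 0 (score: 2)

You are using the wrong class. LinearRegression is the estimator; a fitted model that has been saved is a LinearRegressionModel, so load it with that class: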

from pyspark.ml.regression import LinearRegressionModel

LinearRegressionModel.load('model_lin_reg')
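If it is unclear which class wrote a saved directory, one way to check is to read the metadata that Spark ML's default writer stores next to the model. The following is only a minimal sketch: the helper names saved_stage_class and python_class_path are hypothetical, it assumes the model directory is on the local filesystem, and the mapping from the JVM class name to a PySpark path is the usual naming convention rather than a guarantee.

```python
import glob
import json
import os


def saved_stage_class(model_dir):
    """Return the JVM class name recorded in a saved Spark ML stage.

    Assumption: model_dir is a local directory written by Spark ML's
    default writer, which stores one JSON line in metadata/part-*.
    """
    pattern = os.path.join(model_dir, "metadata", "part-*")
    for part in sorted(glob.glob(pattern)):
        with open(part) as fh:
            line = fh.readline().strip()
        if line:
            return json.loads(line)["class"]
    raise FileNotFoundError("no metadata part file under %r" % model_dir)


def python_class_path(jvm_class):
    """Guess the PySpark wrapper path by swapping the package prefix.

    This mirrors a naming convention and may not hold for every class.
    """
    return jvm_class.replace("org.apache.spark", "pyspark", 1)
```

For the directory in the question, calling saved_stage_class('model_lin_reg') would be expected to report a class ending in LinearRegressionModel, which matches the class named in the NoSuchMethodException and is the one whose load() should be called.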