I am fitting a logistic regression using a Pipeline.
These are the last few lines before the logistic regression is run:
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

# Create an assembler that combines the numeric and encoded categorical features
lr_assembler = VectorAssembler(
    inputCols=numericColumns
    + [categoricalCol + "ClassVec" for categoricalCol in categoricalColumns],
    outputCol="lr_features",
)

# Model definition:
lr = LogisticRegression(featuresCol="lr_features", labelCol="targetvar")

# Pipeline definition:
lr_pipeline = Pipeline(stages=indexStages + encodeStages + [lr_assembler, lr])

# Fit the logistic regression model:
lrModel = lr_pipeline.fit(train_train)
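Then I tried to get the model's summary. However, the following line:
trainingSummary = lrModel.summary
results in: 'PipelineModel' object has no attribute 'summary'
Any advice on how to extract the summary information that a regression model normally provides from a pipeline model?
Thanks a lot!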
Answer 0: (score: 4)
Get the fitted model from the pipeline's stages:
lrModel.stages[-1].summary
If the model sits earlier in the Pipeline, replace -1 with its index.
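If you would rather not hard-code the index, a small helper can look the stage up by type instead. This is only a sketch: `last_stage_of_type` is a hypothetical name, not part of PySpark, and it assumes nothing beyond the fitted pipeline exposing a `stages` list, as `PipelineModel` does.

```python
def last_stage_of_type(pipeline_model, stage_type):
    """Return the last stage of a fitted pipeline that is an instance of stage_type.

    Hypothetical helper (not a PySpark API). It relies only on the
    `stages` attribute, which a fitted PipelineModel provides as a list.
    """
    # Walk backwards so the final matching stage wins, mirroring stages[-1].
    for stage in reversed(pipeline_model.stages):
        if isinstance(stage, stage_type):
            return stage
    raise ValueError(
        "no stage of type %s found in the pipeline" % stage_type.__name__
    )
```

With the pipeline from the question, this could be used as `last_stage_of_type(lrModel, LogisticRegressionModel).summary`, with `LogisticRegressionModel` imported from `pyspark.ml.classification`. The lookup keeps working if stages are later added after or before the regression, which a fixed index would not.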