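I want to use cross-validation with the PySpark ML library to obtain the (internal) training accuracy:
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

lr = LogisticRegression()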
param_grid = (ParamGridBuilder()
              .addGrid(lr.regParam, [0.01, 0.5])
              .addGrid(lr.maxIter, [5, 10])
              .addGrid(lr.elasticNetParam, [0.01, 0.1])
              .build())
evaluator = MulticlassClassificationEvaluator(predictionCol='prediction')
cv = CrossValidator(estimator=lr,
                    estimatorParamMaps=param_grid,
                    evaluator=evaluator,
                    numFolds=5)
model_cv = cv.fit(train)
predictions_lr = model_cv.transform(validation)
predictions = evaluator.evaluate(predictions_lr)
To get the accuracy metric for each c.v. fold, I tried print(model_cv.subModels), but this prints nothing useful (None).
How can I get the accuracy for each fold?
Answer 0 (score: 1)
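I know this is quite old, but in case anyone else is looking for how to make Spark keep the non-best models during cross-validation: you need to enable sub-model collection when creating the CrossValidator. Just set collectSubModels to True (it is False by default).
That is: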
CrossValidator(estimator=lr,
               estimatorParamMaps=param_grid,
               evaluator=evaluator,
               numFolds=5,
               collectSubModels=True)
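Once sub-models are collected, model_cv.subModels is a nested list indexed as subModels[fold][param_index]. It holds the fitted models, not their scores, so you still have to score them yourself. Below is a minimal sketch of one way to do that. The helper name per_fold_metrics is made up for illustration (it is not part of PySpark), and it scores every sub-model on whatever dataset you pass in, which is not necessarily the held-out split that fold used internally.

```python
def per_fold_metrics(sub_models, evaluator, dataset):
    """Score every collected sub-model on `dataset`.

    sub_models: nested list shaped like CrossValidatorModel.subModels,
                i.e. sub_models[fold][param_index].
    evaluator:  any object with an evaluate(predictions) method.
    dataset:    the data to score on (an assumption: chosen by the caller).
    Returns a list of (fold, param_index, metric) tuples.
    """
    results = []
    for fold, models in enumerate(sub_models):
        for param_index, model in enumerate(models):
            # Each sub-model is a fitted model, so transform() yields predictions.
            metric = evaluator.evaluate(model.transform(dataset))
            results.append((fold, param_index, metric))
    return results


# Hypothetical usage with the objects from the question:
# scores = per_fold_metrics(model_cv.subModels, evaluator, validation)
# for fold, param_index, metric in scores:
#     print(fold, param_index, metric)
```

Two side notes. MulticlassClassificationEvaluator reports f1 by default, so if you really want accuracy, create it with metricName='accuracy'. And if the average across folds for each parameter combination is enough, model_cv.avgMetrics already has it without collecting sub-models.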