I am trying to use ChiSqSelector to determine the best features for a Spark 2.2 LinearSVC model, like this:
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.ChiSqSelector
val chiSelector = new ChiSqSelector().setNumTopFeatures(5).
  setFeaturesCol("features").
  setLabelCol("label").setOutputCol("selectedFeatures")
val pipeline = new Pipeline().setStages(Array(labelIndexer, monthIndexer, hashingTF,
  idf, va, featureIndexer, chiSelector, lsvc, labelConverter))
val model = pipeline.fit(training)
// Attempt 1: read the selection directly from the fitted pipeline
val importantFeatures = model.selectedFeatures
// Attempt 2: read it from one of the fitted stages, cast to LinearSVCModel
import org.apache.spark.ml.classification.LinearSVCModel
val LSVCModel = model.stages(6).asInstanceOf[LinearSVCModel]
val importantFeatures = LSVCModel.selectedFeatures
This gives the error:
<console>:180: error: value selectedFeatures is not a member of
org.apache.spark.ml.classification.LinearSVCModel
val importantFeatures = LSVCModel.selectedFeatures
Is it possible to use ChiSqSelector with this model? If not, are there any alternatives?
Answer 0 (score: 0)
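LinearSVC does not perform any feature selection. You should extract the ChiSqSelectorModel from the pipeline, not the LinearSVCModel. In the pipeline above, the fitted selector is the stage at index 6: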
import org.apache.spark.ml.feature.ChiSqSelectorModel
val chiSqModel = model.stages(6).asInstanceOf[ChiSqSelectorModel]
val importantFeatures = chiSqModel.selectedFeatures
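
If you would rather not depend on a hard-coded stage index, the fitted stages can be matched by type instead. The following is only a minimal sketch for spark-shell, not the asker's actual pipeline: the four-row dataset and the names data, selector, svc, fitted and chosen are made up for illustration. It also assumes the classifier should read the selector's output column, which the question does not show.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LinearSVC
import org.apache.spark.ml.feature.{ChiSqSelector, ChiSqSelectorModel}
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Local session for the sketch; in spark-shell a session already exists.
val spark = SparkSession.builder().master("local[1]").appName("chisq-sketch").getOrCreate()
import spark.implicits._

// Hypothetical toy data: binary label plus three categorical-valued features.
val data = Seq(
  (0.0, Vectors.dense(0.0, 1.0, 3.0)),
  (0.0, Vectors.dense(0.0, 2.0, 1.0)),
  (1.0, Vectors.dense(1.0, 1.0, 3.0)),
  (1.0, Vectors.dense(1.0, 2.0, 1.0))
).toDF("label", "features")

val selector = new ChiSqSelector().setNumTopFeatures(1).
  setFeaturesCol("features").setLabelCol("label").setOutputCol("selectedFeatures")

// Assumption: the classifier is trained on the selector's output column.
val svc = new LinearSVC().setFeaturesCol("selectedFeatures").setLabelCol("label")

val fitted = new Pipeline().setStages(Array(selector, svc)).fit(data)

// Match the fitted stage by type rather than by position in the pipeline.
val chosen: Option[Array[Int]] =
  fitted.stages.collectFirst { case m: ChiSqSelectorModel => m.selectedFeatures }

chosen.foreach(idx => println(idx.mkString(", ")))
```

The values in selectedFeatures are indices into the selector's input vector (the "features" column here), so they have to be mapped back to whatever produced that vector to be interpreted.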