I have a random forest model and I am trying to get its featureImportances vector.
Map<Object, Object> categoricalFeaturesParam = new HashMap<>();
scala.collection.immutable.Map<Object, Object> categoricalFeatures = (scala.collection.immutable.Map<Object, Object>)
scala.collection.immutable.Map$.MODULE$.apply(JavaConversions.mapAsScalaMap(categoricalFeaturesParam).toSeq());
int numberOfClasses = 2;
RandomForestClassifier rfc = new RandomForestClassifier();
RandomForestClassificationModel rfm = RandomForestClassificationModel.fromOld(model, rfc, categoricalFeatures, numberOfClasses);
System.out.println(rfm.featureImportances());
When I run the code above, featureImportances comes back null. Do I need to set something specific in order to get the feature importances of the random forest model?
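For context, model in the snippet above is an old org.apache.spark.mllib.tree.model.RandomForestModel, since that is what RandomForestClassificationModel.fromOld expects. A minimal sketch of how such a model gets trained, using toy in-memory data and placeholder parameters rather than my real pipeline:

import java.util.Arrays;
import java.util.HashMap;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.tree.RandomForest;
import org.apache.spark.mllib.tree.model.RandomForestModel;

public class TrainOldRandomForest {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("rf-old-api").setMaster("local[*]"));

        // Toy binary-classification data: label plus two numeric features.
        JavaRDD<LabeledPoint> data = sc.parallelize(Arrays.asList(
                new LabeledPoint(0.0, Vectors.dense(0.0, 1.0)),
                new LabeledPoint(1.0, Vectors.dense(1.0, 0.0)),
                new LabeledPoint(0.0, Vectors.dense(0.1, 0.9)),
                new LabeledPoint(1.0, Vectors.dense(0.9, 0.1))));

        // Old mllib API: 2 classes, no categorical features, gini impurity,
        // 5 trees of depth 4, 32 bins, fixed seed.
        RandomForestModel model = RandomForest.trainClassifier(
                data, 2, new HashMap<Integer, Integer>(), 5,
                "auto", "gini", 4, 32, 12345);

        System.out.println(model.toDebugString());
        sc.stop();
    }
}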
I also tried this with Spark 1.6, whose API takes numberOfFeatures as a fifth parameter, but featureImportances is still null.
RandomForestClassifier rfc = getRandomForestClassifier(numTrees, maxBinSize, maxTreeDepth, seed, impurity);
RandomForestClassificationModel rfm = RandomForestClassificationModel.fromOld(model, rfc, categoricalFeatures, numberOfClasses, numberOfFeatures);
System.out.println(rfm.featureImportances());
Stack trace:

Exception in thread "main" java.lang.NullPointerException
    at org.apache.spark.ml.tree.impl.RandomForest$.computeFeatureImportance(RandomForest.scala:1152)
    at org.apache.spark.ml.tree.impl.RandomForest$$anonfun$featureImportances$1.apply(RandomForest.scala:1111)
    at org.apache.spark.ml.tree.impl.RandomForest$$anonfun$featureImportances$1.apply(RandomForest.scala:1108)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.spark.ml.tree.impl.RandomForest$.featureImportances(RandomForest.scala:1108)
    at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances$lzycompute(RandomForestClassifier.scala:237)
    at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances(RandomForestClassifier.scala:237)
    at com.markmonitor.antifraud.ce.ml.CheckFeatureImportance.main(CheckFeatureImportance.java:49)
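For comparison, the DataFrame-based flow, where the model is fit directly with RandomForestClassifier and where I would expect featureImportances to be populated on the fitted model, looks roughly like the sketch below (again toy data and placeholder parameters, not my real code). What I actually need, though, is the conversion path from the already-trained old model:

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.ml.classification.RandomForestClassificationModel;
import org.apache.spark.ml.classification.RandomForestClassifier;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class NewApiFeatureImportance {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("rf-new-api").setMaster("local[*]"));
        SQLContext sqlContext = new SQLContext(sc);

        // LabeledPoints become a DataFrame with the default "label" and
        // "features" columns that RandomForestClassifier expects.
        DataFrame training = sqlContext.createDataFrame(
                sc.parallelize(Arrays.asList(
                        new LabeledPoint(0.0, Vectors.dense(0.0, 1.0)),
                        new LabeledPoint(1.0, Vectors.dense(1.0, 0.0)),
                        new LabeledPoint(0.0, Vectors.dense(0.1, 0.9)),
                        new LabeledPoint(1.0, Vectors.dense(0.9, 0.1)))),
                LabeledPoint.class);

        RandomForestClassifier rfc = new RandomForestClassifier()
                .setNumTrees(5)
                .setMaxDepth(4)
                .setImpurity("gini");

        RandomForestClassificationModel rfm = rfc.fit(training);
        System.out.println(rfm.featureImportances());
        sc.stop();
    }
}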