Question

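Is it possible to choose the combining strategy for MLlib's random forest? I can't find any hint in the official API documentation.

Here is my code: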

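import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.RandomForest

// trainData and testData are RDD[LabeledPoint], prepared elsewhere
val numClasses = 10
val categoricalFeaturesInfo = Map[Int, Int]()  // empty: all features are continuous
val numTrees = 10
val featureSubsetStrategy = "auto"
val impurity = "entropy"
val maxDepth = 2
val maxBins = 320

val model = RandomForest.trainClassifier(trainData, numClasses, categoricalFeaturesInfo,
  numTrees, featureSubsetStrategy, impurity, maxDepth, maxBins)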

val predictionAndLabels = testData.map { case LabeledPoint(label, features) =>
  val prediction = model.predict(features)
  (prediction, label)
}

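I know that the predict method (implemented in the TreeEnsembleModel class, in treeEnsembleModels.scala) takes the combining strategy (Sum, Average or Vote) into account: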

def predict(features: Vector): Double = {
    (algo, combiningStrategy) match {
      case (Regression, Sum) =>
        predictBySumming(features)
      case (Regression, Average) =>
        predictBySumming(features) / sumWeights
      case (Classification, Sum) => // binary classification
        val prediction = predictBySumming(features)
        // TODO: predicted labels are +1 or -1 for GBT. Need a better way to store this info.
        if (prediction > 0.0) 1.0 else 0.0
      case (Classification, Vote) =>
        predictByVoting(features)
      case _ =>
        throw new IllegalArgumentException(
          "TreeEnsembleModel given unsupported (algo, combiningStrategy) combination: " +
        s"($algo, $combiningStrategy).")
    }
}
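
To make the three strategy names concrete, here is a minimal, self-contained sketch of how per-tree predictions could be combined under each one. It is not Spark code: `CombiningSketch`, its `Strategy` enum and the helper methods are hypothetical names chosen for illustration, and the tie-breaking rule in the vote (first label seen wins) is an assumption of this sketch.

```java
// Illustrative stand-in, not Spark code. All names here are hypothetical.
final class CombiningSketch {
    enum Strategy { SUM, AVERAGE, VOTE }

    /** Combines one prediction per tree into a single value. */
    static double combine(double[] treePredictions, double[] treeWeights, Strategy strategy) {
        switch (strategy) {
            case SUM:
                return weightedSum(treePredictions, treeWeights);
            case AVERAGE:
                return weightedSum(treePredictions, treeWeights) / sum(treeWeights);
            case VOTE:
                return weightedVote(treePredictions, treeWeights);
            default:
                throw new IllegalArgumentException("Unsupported strategy: " + strategy);
        }
    }

    /** Sum of prediction * weight over all trees. */
    static double weightedSum(double[] predictions, double[] weights) {
        double total = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            total += predictions[i] * weights[i];
        }
        return total;
    }

    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /**
     * Returns the label whose supporting trees carry the largest total weight.
     * Ties go to the label that appears first (an assumption of this sketch).
     */
    static double weightedVote(double[] labels, double[] weights) {
        double bestLabel = labels[0];
        double bestWeight = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < labels.length; i++) {
            double total = 0.0;
            for (int j = 0; j < labels.length; j++) {
                if (labels[j] == labels[i]) {
                    total += weights[j];
                }
            }
            if (total > bestWeight) {
                bestWeight = total;
                bestLabel = labels[i];
            }
        }
        return bestLabel;
    }
}
```

With per-tree labels 1.0, 0.0, 1.0 and equal weights, `VOTE` returns 1.0 while `AVERAGE` returns 2/3, which is not a class label. That is consistent with the match above, where Average is only paired with Regression and Vote only with Classification.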

Answer 1

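As far as I can tell, the only way to do this is to use reflection after the model has been built. It should be possible because the field is only read when predict is called (I haven't tried running this code, but something like it should work).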

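import java.lang.reflect.Field;

RandomForestModel model = ...;
Class<?> c = model.getClass();
// getDeclaredField only sees fields declared on c itself; if the field is declared
// on a parent class (e.g. TreeEnsembleModel), look it up on c.getSuperclass() instead.
Field strategy = c.getDeclaredField("combiningStrategy");
strategy.setAccessible(true);  // needed when the field is not public
strategy.set(model, whatever);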
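
The reflection idea can be tried without Spark. The sketch below uses hypothetical stand-in classes (`BaseModel` and `ForestModel` are not Spark classes) to show the two details that usually trip this up: the field may be declared on a parent class, and a non-public field needs `setAccessible(true)` before it can be written. Whether the same approach works on a real RandomForestModel is an assumption that would need to be checked against the Spark version in use.

```java
// Hypothetical stand-ins for a parent model class and its subclass; not Spark classes.
class BaseModel {
    private final String combiningStrategy;

    BaseModel(String combiningStrategy) {
        this.combiningStrategy = combiningStrategy;
    }

    String strategy() {
        return combiningStrategy;
    }
}

final class ForestModel extends BaseModel {
    ForestModel() {
        super("Vote");
    }
}

final class ReflectionSketch {
    /** Walks up the class hierarchy until some class declares the named field. */
    static java.lang.reflect.Field findField(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(name);
            } catch (NoSuchFieldException ignored) {
                // Not declared on this class; try its parent.
            }
        }
        throw new IllegalStateException("No field named " + name);
    }

    /** Overwrites a (possibly private, possibly inherited) instance field. */
    static void overwrite(Object target, String fieldName, Object value) {
        java.lang.reflect.Field field = findField(target.getClass(), fieldName);
        try {
            field.setAccessible(true); // bypass the private modifier
            field.set(target, value);
        } catch (ReflectiveOperationException | RuntimeException e) {
            throw new IllegalStateException("Could not set " + fieldName, e);
        }
    }
}
```

A call such as `ReflectionSketch.overwrite(new ForestModel(), "combiningStrategy", "Average")` changes what `strategy()` returns on that instance. Writing to a final field this way depends on the JVM and its access settings, so treat it as a workaround rather than a supported API.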
