How to increase maxMemoryInMB for DecisionTree

Time: 2015-08-12 12:43:53

Tags: scala apache-spark

I am trying to train a model with DecisionTree in Spark, using Scala.

My code is as follows:

val numClasses = 19413
val categoricalFeaturesInfo = Map[Int, Int](5 -> 14)
val impurity = "gini"
val maxDepth = 5
val maxBins = 23000

val model = DecisionTree.trainClassifier(trainData, numClasses, categoricalFeaturesInfo, impurity, maxDepth, maxBins)

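However, when I run it I get an IllegalArgumentException telling me that maxMemoryInMB must be at least 8275. I tried to look up how to increase this value but found nothing. Any help would be greatly appreciated!

Kind regards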

2 Answers:

Answer 0: (score: 0)

If you are using Spark 1.3.1 like me, this code may help:

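import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.configuration.{Algo, Strategy}
import org.apache.spark.mllib.tree.impurity.Gini

// maxMemoryInMB must be at least the minimum reported in the exception
val strategy = new Strategy(Algo.Classification, Gini, maxDepth1,
                            numClasses1, maxBins = maxBins1,
                            categoricalFeaturesInfo = categoricalFeaturesInfo1,
                            maxMemoryInMB = 512)

val model1 = DecisionTree.train(trainingData, strategy)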
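
For the question's data, 512 would still be below the reported minimum of 8275. As a rough sketch of where such a minimum seems to come from: assuming MLlib stores one 8-byte Double per (bin, class) pair when it aggregates statistics for a single tree node, the requirement grows with numClasses times the total number of bins. The object MinMemoryEstimate, its method names, and the per-feature bin counts below are all hypothetical, and the formula reflects a reading of the MLlib source rather than documented behavior.

```scala
// Rough estimate of the minimum maxMemoryInMB for a classification tree.
// ASSUMPTION: one Double (8 bytes) is kept per (bin, class) pair for a node;
// this is an interpretation of the MLlib implementation, not a public API.
object MinMemoryEstimate {

  /** Bytes needed to aggregate the statistics of a single node. */
  def bytesPerNode(numClasses: Int, binsPerFeature: Seq[Int]): Long = {
    val totalBins = binsPerFeature.map(_.toLong).sum
    numClasses.toLong * totalBins * 8L
  }

  /** The same figure in whole megabytes, rounded down. */
  def minMemoryInMB(numClasses: Int, binsPerFeature: Seq[Int]): Long =
    bytesPerNode(numClasses, binsPerFeature) / (1024L * 1024L)

  def main(args: Array[String]): Unit = {
    // Hypothetical bin counts: one continuous feature using all 23000 bins
    // plus the categorical feature with 14 values from the question.
    val mb = minMemoryInMB(numClasses = 19413, binsPerFeature = Seq(23000, 14))
    println(s"estimated minimum maxMemoryInMB: $mb")
  }
}
```

If this reading is right, lowering maxBins or the number of classes shrinks the requirement, which may be an alternative to raising maxMemoryInMB.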

Answer 1: (score: 0)

I had the same problem with Spark 1.6.2; the solution is to use a Strategy:

import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.configuration.Strategy
val s = Strategy.defaultStrategy("Classification")
s.setMaxMemoryInMB(756)
... /* other settings */
val model = DecisionTree.train(trainingVector, s)
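
Combining the question's parameters with a Strategy might look like the sketch below. The object TrainWithMoreMemory and its method names are hypothetical, the parameter values are copied from the question, and the memory limit is passed in so that it can be set to the minimum reported in the exception (8275 in the question). It assumes Spark MLlib 1.x is on the classpath.

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.configuration.Strategy
import org.apache.spark.mllib.tree.model.DecisionTreeModel
import org.apache.spark.rdd.RDD

// Hypothetical helper; the names are illustrative, not part of Spark.
object TrainWithMoreMemory {

  /** Builds a classification Strategy using the values from the question. */
  def buildStrategy(memoryInMB: Int): Strategy = {
    // defaultStrategy("Classification") uses Gini impurity by default
    val s = Strategy.defaultStrategy("Classification")
    s.setNumClasses(19413)
    s.setMaxDepth(5)
    s.setMaxBins(23000)
    // Feature 5 is categorical with 14 distinct values
    s.categoricalFeaturesInfo = Map(5 -> 14)
    // The setting that trainClassifier does not expose
    s.setMaxMemoryInMB(memoryInMB)
    s
  }

  /** Trains a tree on the given data with the enlarged memory limit. */
  def train(trainData: RDD[LabeledPoint], memoryInMB: Int): DecisionTreeModel =
    DecisionTree.train(trainData, buildStrategy(memoryInMB))
}
```

With the question's data this would be called as TrainWithMoreMemory.train(trainData, 8275). The cluster presumably also needs enough executor memory available for a limit of that size to be usable.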