I am trying to implement a decision tree by following the examples here: https://spark.apache.org/docs/latest/mllib-decision-tree.html#examples
My sample code is:
val splits = predictionsNewdfNew.randomSplit(Array(0.7, 0.3))
val (trainingData, testData) = (splits(0), splits(1))
val numClasses = 3
val categoricalFeaturesInfo = Map[Int, Int]()
val impurity = "gini"
val maxDepth = 5
val maxBins = 32
val model2 = DecisionTree.trainClassifier(trainingData, numClasses, categoricalFeaturesInfo, impurity, maxDepth, maxBins)
'predictionsNewdfNew' is a DataFrame with rows such as:
|Apn5Q_b6Nz61Tq4Xz...|51.0918130155|-114.031674872| 3| 24|4.0| good|
|AjEbIBw6ZFfln7ePH...| 35.9607337| -114.939821| 3| 3|4.5|satisfactory|
|bFzdJJ3wp3PZssNEs...| 33.4499993| -112.0769793| 7| 8|1.5| bad|
The last column is the label. The error is:
overloaded method value trainClassifier with alternatives:
(input: org.apache.spark.api.java.JavaRDD[org.apache.spark.mllib.regression.LabeledPoint],
numClasses: Int,
categoricalFeaturesInfo: java.util.Map[Integer,Integer],
impurity: String,
maxDepth: Int,
maxBins: Int)org.apache.spark.mllib.tree.model.DecisionTreeModel
<and> (input: org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint],
numClasses: Int,
categoricalFeaturesInfo: Map[Int,Int],
impurity: String,
maxDepth: Int,
maxBins: Int)
org.apache.spark.mllib.tree.model.DecisionTreeModel cannot be applied to (org.apache.spark.sql.Dataset[org.apache.spark.sql.Row], String, Int, Int)
Can someone help me understand what is wrong with the syntax here?
Thanks.
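Reading the error, both alternatives of trainClassifier take an RDD (or JavaRDD) of LabeledPoint, while trainingData here is a Dataset[Row]. Below is a minimal sketch of what the conversion might look like, meant for a spark-shell session where predictionsNewdfNew is already defined. Several things in it are assumptions, not facts from my data: that column 0 is an ID, that columns 1 to 5 are numeric features, that column 6 is the label, that the only label values are "bad", "satisfactory" and "good", and that no cell is null. The names labelIndex, labeled, trainingPoints, testPoints and sketchModel are made up for illustration.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.rdd.RDD

// Assumed encoding: MLlib classification labels must be 0.0, 1.0, ..., numClasses - 1.
val labelIndex = Map("bad" -> 0.0, "satisfactory" -> 1.0, "good" -> 2.0)

// Assumed layout: column 0 = ID (skipped), columns 1-5 = numeric features, column 6 = label.
// Going through toString tolerates Int, Double, or String cells, but fails on nulls.
val labeled: RDD[LabeledPoint] = predictionsNewdfNew.rdd.map { row =>
  val features = (1 to 5).map(i => row.get(i).toString.trim.toDouble).toArray
  val label = labelIndex(row.get(6).toString.trim)
  LabeledPoint(label, Vectors.dense(features))
}

// Split the RDD[LabeledPoint] rather than the DataFrame.
val Array(trainingPoints, testPoints) = labeled.randomSplit(Array(0.7, 0.3))

// Same hyperparameters as in the question; the input is now an RDD[LabeledPoint],
// which matches the second alternative listed in the error.
val sketchModel = DecisionTree.trainClassifier(
  trainingPoints, 3, Map[Int, Int](), "gini", 5, 32)
```

If the real columns differ from the assumed layout, only the index range and the label lookup inside the map would need to change.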