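import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree

val pdata = sc.parallelize(Seq(data))
val parsedData = data.map { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
}.cache()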
// Split the data into training and test sets (30% held out for testing)
val splits = parsedData.randomSplit(Array(0.7, 0.3))
val (trainingData, testData) = (splits(0), splits(1))
// Train a DecisionTree model.
val numClasses = 2
val categoricalFeaturesInfo = {}
val impurity = "gini"
val maxDepth = 5
val maxBins = 32
val model = DecisionTree.trainClassifier(trainingData, numClasses, categoricalFeaturesInfo, impurity, maxDepth, maxBins)
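I wrote this code to build a decision tree classification model on the given data. The first column is the label (the column to predict). It throws an error that says "overloaded method value trainClassifier with alternatives:".
Here is my sample input data: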
1 2 50 12500 98
1 0 13 3250 28
1 1 16 4000 35
1 2 20 5000 45
0 1 24 6000 77
0 4 4 1000 4
1 2 7 1750 14
0 1 12 3000 35
1 2 9 2250 22
1 5 46 11500 98
0 4 23 5750 58
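
For reference, here is a minimal sketch of how this kind of input could be parsed and passed to `trainClassifier`. It is not a confirmed fix. It assumes that `sc` is an existing `SparkContext` (as in `spark-shell`), that every row is whitespace-separated with the label in the first column, and that the overload error comes from `categoricalFeaturesInfo`: in Scala, `{}` evaluates to `Unit`, while `trainClassifier` expects a `Map[Int, Int]`. The names `rows` and `seed` are illustrative only.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree

// Assumption: `sc` is an already-created SparkContext (e.g. inside spark-shell).
// The rows are copied from the sample input above.
val rows = Seq(
  "1 2 50 12500 98", "1 0 13 3250 28", "1 1 16 4000 35", "1 2 20 5000 45",
  "0 1 24 6000 77", "0 4 4 1000 4", "1 2 7 1750 14", "0 1 12 3000 35",
  "1 2 9 2250 22", "1 5 46 11500 98", "0 4 23 5750 58"
)

// Split each row on whitespace: first value is the label, the rest are features.
val parsedData = sc.parallelize(rows).map { line =>
  val parts = line.trim.split("\\s+").map(_.toDouble)
  LabeledPoint(parts.head, Vectors.dense(parts.tail))
}.cache()

// 70/30 split; the fixed seed is only there to make the sketch repeatable.
val Array(trainingData, testData) = parsedData.randomSplit(Array(0.7, 0.3), seed = 42L)

// An empty Map[Int, Int] declares all features as continuous.
// Writing `{}` here would produce Unit, which matches no trainClassifier overload.
val categoricalFeaturesInfo = Map[Int, Int]()

val model = DecisionTree.trainClassifier(
  trainingData, 2, categoricalFeaturesInfo, "gini", 5, 32)
```

With only eleven rows the split leaves very few training examples, so this is useful for checking that the call compiles and runs, not for judging model quality.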