Cannot instantiate the BinaryClassificationMetrics class in Spark

Date: 2018-12-19 21:28:43

Tags: scala apache-spark apache-spark-mllib

I am using Spark MLlib with Scala for the first time and am having trouble instantiating the BinaryClassificationMetrics class. It produces a Cannot resolve constructor error even though I have formatted its input as an RDD of tuples, as required. Any idea what might be wrong?

def modelEvaluation(model: PipelineModel, test: DataFrame): Unit = {
  // Make a prediction on the test set
  val predictionAndLabels = model.transform(test)
    .select("prediction","label")
    .rdd
    .map(r => (r(0),r(1)))
    /*.collect()
    .foreach(r => println(r))*/

  // Instantiate metrics object
  val metrics = new BinaryClassificationMetrics(predictionAndLabels)

  // Precision-Recall Curve
  //val PRC = metrics.pr
}

1 answer:

Answer 0: (score: 0)

BinaryClassificationMetrics requires an RDD[(Double, Double)]. For details, see: https://spark.apache.org/docs/2.4.0/api/scala/index.html#org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

So you can change it like this:

def modelEvaluation(model: PipelineModel, test: DataFrame): Unit = {
  // Make a prediction on the test set
  val predictionAndLabels = model.transform(test)
    .select("prediction","label")
    .rdd
    .map(r => (r(0).toString.toDouble,r(1).toString.toDouble))

  // Instantiate metrics object
  val metrics = new BinaryClassificationMetrics(predictionAndLabels)

  // Precision-Recall Curve
  //val PRC = metrics.pr
}
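The root cause is that indexing a Spark Row with r(0) returns Any, so the question's map produces an RDD[(Any, Any)], which does not match the constructor's expected RDD[(Double, Double)]. A minimal sketch of the type issue, with a Seq[Any] standing in for a Row (no Spark required):

```scala
object TypeDemo extends App {
  // Row.apply(i) returns Any; a Seq[Any] models that here.
  val row: Seq[Any] = Seq(1.0, 0.0)

  // This is what .map(r => (r(0), r(1))) yields: a tuple of Any,
  // which cannot satisfy a parameter typed RDD[(Double, Double)].
  val untyped: (Any, Any) = (row(0), row(1))

  // The answer's fix converts each element explicitly to Double.
  val typed: (Double, Double) =
    (row(0).toString.toDouble, row(1).toString.toDouble)

  println(typed) // (1.0,0.0)
}
```

When the columns are already stored as Double, `r.getDouble(0)` is an equivalent, slightly more direct typed getter on Row that avoids the string round-trip.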