Question

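This is my first time using Spark MLlib from Scala, and I'm having trouble instantiating the BinaryClassificationMetrics class. Even though I shape its input as an RDD of tuples, as required, I get a "Cannot resolve constructor" error. Any idea what might be going wrong?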

def modelEvaluation(model: PipelineModel, test: DataFrame): Unit = {
 // Make a prediction on the test set
    val predictionAndLabels = model.transform(test)
      .select("prediction","label")
      .rdd
      .map(r => (r(0),r(1)))
      /*.collect()
      .foreach(r => println(r))*/

    // Instantiate metrics object
    val metrics = new BinaryClassificationMetrics(predictionAndLabels)

    // Precision-Recall Curve
    //val PRC = metrics.pr
  }

Answer 1

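BinaryClassificationMetrics expects an RDD[(Double, Double)], but r(0) and r(1) on a Row return Any, so your map produces an RDD[(Any, Any)] and no constructor matches. Details: https://spark.apache.org/docs/2.4.0/api/scala/index.html#org.apache.spark.mllib.evaluation.BinaryClassificationMetrics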
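A Spark-free sketch of why the question's code does not compile. Row's apply(i) is declared to return Any, so the tuples built from r(0) and r(1) are typed (Any, Any). Here FakeRow and needsDoubles are hypothetical stand-ins (they are not Spark APIs) for Row and for a constructor that only accepts pairs of Doubles:

```scala
// Hypothetical stand-in for org.apache.spark.sql.Row: apply(i) returns Any.
final case class FakeRow(values: Seq[Any]) {
  def apply(i: Int): Any = values(i)
  def getDouble(i: Int): Double = values(i).asInstanceOf[Double]
}

// Hypothetical stand-in for a constructor that requires pairs of Doubles.
def needsDoubles(pairs: Seq[(Double, Double)]): Int = pairs.size

val rows = Seq(FakeRow(Seq(1.0, 0.0)), FakeRow(Seq(0.0, 0.0)))

// Element type is (Any, Any), mirroring .map(r => (r(0), r(1))) in the question.
val untyped: Seq[(Any, Any)] = rows.map(r => (r(0), r(1)))
// needsDoubles(untyped) // would not compile: Seq[(Any, Any)] is not Seq[(Double, Double)]

// Extracting typed values yields the element type the consumer expects.
val typed: Seq[(Double, Double)] = rows.map(r => (r.getDouble(0), r.getDouble(1)))
val pairCount = needsDoubles(typed)
```

The same reasoning applies to the RDD in the question: the fix is to make each tuple element a Double before passing the RDD to the constructor.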

So you can change it like this:

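import org.apache.spark.ml.PipelineModel
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.sql.DataFrame

def modelEvaluation(model: PipelineModel, test: DataFrame): Unit = {
  // Make a prediction on the test set
  val predictionAndLabels = model.transform(test)
    .select("prediction", "label")
    .rdd
    // r(i) is typed Any, so convert each value to Double explicitly
    .map(r => (r(0).toString.toDouble, r(1).toString.toDouble))

  // Instantiate metrics object
  val metrics = new BinaryClassificationMetrics(predictionAndLabels)

  // Precision-Recall Curve
  //val PRC = metrics.pr
}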
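
As a possible variant that avoids the round trip through String, the columns can be cast to double in the DataFrame and read with Row.getDouble. This is only a sketch: the helper name toPredictionAndLabels is hypothetical, and it assumes the "prediction" and "label" columns are numeric and contain no nulls (getDouble fails on a null value).

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical helper: extract (prediction, label) pairs as Doubles.
// Casting first means an integer-typed label column also works.
def toPredictionAndLabels(predictions: DataFrame): RDD[(Double, Double)] =
  predictions
    .select(col("prediction").cast("double"), col("label").cast("double"))
    .rdd
    .map(row => (row.getDouble(0), row.getDouble(1)))

def modelEvaluation(model: PipelineModel, test: DataFrame): Unit = {
  val predictionAndLabels = toPredictionAndLabels(model.transform(test))

  // The RDD is now RDD[(Double, Double)], so the constructor resolves.
  val metrics = new BinaryClassificationMetrics(predictionAndLabels)

  println(s"Area under PR curve: ${metrics.areaUnderPR()}")
}
```

Declaring the return type RDD[(Double, Double)] on the helper makes the compiler report any remaining type mismatch at the extraction step rather than at the constructor call.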

Unable to instantiate the BinaryClassificationMetrics class in Spark
