found: org.apache.spark.sql.Dataset[(Double, Double)] required: org.apache.spark.rdd.RDD[(Double, Double)]

Time: 2016-11-13 19:26:28

Tags: scala apache-spark apache-spark-sql spark-dataframe rdd

I am getting the following error:

 found   : org.apache.spark.sql.Dataset[(Double, Double)]
 required: org.apache.spark.rdd.RDD[(Double, Double)]
    val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)

in the following code:

val testScoreAndLabel = testResults.
    select("Label","ModelProbability").
    map{ case Row(l:Double,p:Vector) => (p(1),l) }
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)

From the error, it seems that testScoreAndLabel is of type sql.Dataset, but BinaryClassificationMetrics requires an RDD.

How can I convert a sql.Dataset into an RDD?

1 answer:

Answer 0 (score: 1)

I would do something like this:

val testScoreAndLabel = testResults.
    select("Label","ModelProbability").
    map{ case Row(l:Double,p:Vector) => (p(1),l) }

Now convert testScoreAndLabel to an RDD simply by calling .rdd on it:

testScoreAndLabel.rdd

API Doc
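
For anyone who wants to try the conversion end to end, here is a minimal sketch. It assumes Spark 2.x or later with spark-sql and spark-mllib on the classpath, and that ModelProbability holds an org.apache.spark.ml.linalg.Vector. The object name DatasetToRddSketch, the helper metricsFor, and the four sample rows are made up for illustration and are not from the question.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Dataset, Row, SparkSession}

object DatasetToRddSketch {
  // Hypothetical helper: builds metrics from a tiny, made-up DataFrame
  // shaped like the testResults in the question.
  def metricsFor(spark: SparkSession): BinaryClassificationMetrics = {
    import spark.implicits._ // supplies the Encoder that Dataset.map needs

    // Assumed stand-in for testResults: a label and a probability vector.
    val testResults = Seq(
      (1.0, Vectors.dense(0.2, 0.8)),
      (0.0, Vectors.dense(0.9, 0.1)),
      (1.0, Vectors.dense(0.4, 0.6)),
      (0.0, Vectors.dense(0.7, 0.3))
    ).toDF("Label", "ModelProbability")

    // map on a DataFrame yields a Dataset, not an RDD.
    val testScoreAndLabel: Dataset[(Double, Double)] = testResults
      .select("Label", "ModelProbability")
      .map { case Row(l: Double, p: Vector) => (p(1), l) }

    // .rdd exposes the Dataset's contents as an RDD of the same element type.
    val scoreAndLabelRdd: RDD[(Double, Double)] = testScoreAndLabel.rdd

    new BinaryClassificationMetrics(scoreAndLabelRdd)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // local mode, for experimentation only
      .appName("DatasetToRddSketch")
      .getOrCreate()
    try println(s"Area under ROC: ${metricsFor(spark).areaUnderROC()}")
    finally spark.stop()
  }
}
```

The explicit type annotations are only there to make the Dataset-to-RDD step visible; passing testScoreAndLabel.rdd straight to the BinaryClassificationMetrics constructor should work just as well.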