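I am trying to run a Kolmogorov-Smirnov test in Scala on the sample [6, 6], assuming that under the reference distribution all values are identical, i.e. P(6) = 1. This is what I am trying: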
val data: RDD[Double] = sc.parallelize(Seq(6, 6))
val myCDF = Map(6 -> 1)
val testResult2 = Statistics.kolmogorovSmirnovTest(data, myCDF)
println(testResult2)
This is the error I get:
notebook:3: error: overloaded method value kolmogorovSmirnovTest with alternatives:
  (data: org.apache.spark.api.java.JavaDoubleRDD,distName: String,params: Double*)org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult <and>
  (data: org.apache.spark.rdd.RDD[Double],distName: String,params: Double*)org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult <and>
  (data: org.apache.spark.rdd.RDD[Double],cdf: Double => Double)org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
 cannot be applied to (org.apache.spark.rdd.RDD[Double], scala.collection.immutable.Map[Int,Int])
val testResult2 = Statistics.kolmogorovSmirnovTest(data, myCDF)
Does anyone know why this does not work? Also, do you know whether this can be done in PySpark, or do I have to call the Scala code from PySpark? Thanks!
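For reference, the error lists three overloads, and none of them accepts a `Map[Int,Int]`: the second argument must be either a distribution name (`String`) or a function `Double => Double`. Below is a minimal sketch of what the third overload appears to expect. It assumes `sc` is an existing SparkContext (as in a notebook or spark-shell), and the step-function CDF is my own reading of "P(6) = 1", not something taken from the Spark docs. Note also that the Kolmogorov-Smirnov test is normally defined for continuous distributions, so the result for a point mass may not be meaningful.

```scala
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.rdd.RDD

// Assumption: `sc` is an already-created SparkContext (notebook or spark-shell).
val data: RDD[Double] = sc.parallelize(Seq(6.0, 6.0))

// Assumed CDF for a point mass at 6: P(X <= x) is 0 below 6 and 1 from 6 onwards.
// Declared as Double => Double so it matches the (RDD[Double], Double => Double) overload.
val myCDF: Double => Double = x => if (x < 6.0) 0.0 else 1.0

val testResult2 = Statistics.kolmogorovSmirnovTest(data, myCDF)
println(testResult2)
```

As for PySpark: as far as I can tell, `pyspark.mllib.stat.Statistics.kolmogorovSmirnovTest` only takes a distribution name plus parameters and not a custom CDF, so a user-defined CDF would probably still have to go through the Scala API.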