Scala code does not compile in SBT

Asked: 2015-12-04 17:06:57

Tags: scala apache-spark sbt

I wrote a piece of machine learning code that runs fine in the Scala shell. I am using SBT to compile the code and build a JAR. To test compiling in a new project folder, I used some of the example code that ships with Spark (e.g. LocalLR and SparkPI). Those all compile successfully, but for some reason my code does not. I follow all the directory conventions, but still no luck.

The following code produces the errors shown in the sbt output below:

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.evaluation._
    import org.apache.spark.mllib.tree._
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.tree.model._
    import org.apache.spark.rdd._
    import org.apache.spark.mllib.util.MLUtils
    import org.apache.spark.mllib.classification.LogisticRegressionModel

    object PredictOOS {
      def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]): MulticlassMetrics = {
        val predictionsAndLabels = data.map(example =>
          (model.predict(example.features), example.label)
        )
        new MulticlassMetrics(predictionsAndLabels)
      }

      def main(args: Array[String]) {
        val conf = new SparkConf().setAppName("Predict OOS")
        val spark = new SparkContext(conf)

        val data = spark.textFile("D:/data/g1-svm.csv")
        val parsedData = data.map { line =>
          val parts = line.split(',').map(_.toDouble)
          LabeledPoint(parts(0), Vectors.dense(parts.tail))
        }
        val splits = parsedData.randomSplit(Array(0.8, 0.2), seed = 11L)
        val training = splits(0).cache()
        val test = splits(1)

        val model = DecisionTree.trainClassifier(training, 2, Map[Int, Int](), "gini", 20, 300)

        val metrics = getMetrics(model, test)

        println(" confusionMatrix is generated")
        spark.stop()
      }
    }

Please let me know if I am missing anything. I have been stuck on this compilation step for a long time. Any help is much appreciated.

This is an edit to the original post. The code above now compiles successfully, but it fails when I write the output to a file.

    D:\ScalaApps\sparklr>cd ../oos

    D:\ScalaApps\oos>sbt
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
    [info] Set current project to Proj_oos (in build file:/D:/ScalaApps/oos/)
    > compile
    [info] Compiling 1 Scala source to D:\ScalaApps\oos\target\scala-2.11\classes...
    [error] D:\ScalaApps\oos\src\main\scala\oos.scala:5: not found: type MulticlassMetrics
    [error]                     MulticlassMetrics = {
    [error]                     ^
    [error] D:\ScalaApps\oos\src\main\scala\oos.scala:4: not found: type DecisionTreeModel
    [error]                 def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]):
    [error]                                       ^
    [error] D:\ScalaApps\oos\src\main\scala\oos.scala:4: not found: type RDD
    [error]                 def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]):
    [error]                                                                ^
    [error] D:\ScalaApps\oos\src\main\scala\oos.scala:9: not found: type MulticlassMetrics
    [error]                   new MulticlassMetrics(predictionsAndLabels)
    [error]                       ^
    [error] D:\ScalaApps\oos\src\main\scala\oos.scala:19: not found: value LabeledPoint
    [error]                         LabeledPoint(parts(0), Vectors.dense(parts.tail))
    [error]                         ^
    [error] D:\ScalaApps\oos\src\main\scala\oos.scala:19: not found: value Vectors
    [error]                         LabeledPoint(parts(0), Vectors.dense(parts.tail))
    [error]                                                ^
    [error] D:\ScalaApps\oos\src\main\scala\oos.scala:25: not found: value DecisionTree
    [error]                         val model = DecisionTree.trainClassifier(training, 2, Map[Int,Int](), "gini", 20, 300)
    [error]                                     ^
    [error] 7 errors found
    [error] (compile:compileIncremental) Compilation failed
    [error] Total time: 5 s, completed Dec 4, 2015 10:39:22 PM
    >

It fails at this line:

    metrics.confusionMatrix.saveAsTextFile("D:/spark4/confMatrix2")

Do I need to import another package for saveAsTextFile to work?

2 answers:

Answer 0 (score: 0)

You should add the following dependency to your build.sbt:

libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.4.0"

And add the following import to your Scala file:

import org.apache.spark.{SparkConf, SparkContext}
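For reference, a minimal complete build.sbt along these lines might look as follows. This is a sketch, not the asker's actual build file: the project name is taken from the sbt log above, while the version numbers are assumptions and should match the Spark and Scala versions you actually run (the log shows Scala 2.11):

```scala
// Minimal build.sbt sketch; versions are assumptions.
name := "Proj_oos"

version := "1.0"

scalaVersion := "2.11.7"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.4.0",
  "org.apache.spark" %% "spark-mllib" % "1.4.0"
)
```

With spark-mllib on the classpath, the `not found: type MulticlassMetrics` errors should go away, since `MulticlassMetrics`, `DecisionTree`, `LabeledPoint`, and `Vectors` all live in the mllib artifact rather than in spark-core.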

Hope this helps.

Answer 1 (score: 0)

I have solved the problem. Thank you for your time.

    metrics.confusionMatrix.saveAsTextFile("D:/spark4/confMatrix2")

does not work, even in the console. (`confusionMatrix` returns a local `Matrix`, which has no `saveAsTextFile` method; that method exists only on RDDs.) Instead, I had to do the following to save the result:

    val res = metrics.confusionMatrix.toArray
    val res1 = spark.parallelize(res)
    res1.coalesce(1).saveAsTextFile("D:/spark4/confmatrix2")
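One caveat with this approach: `Matrix.toArray` returns the entries in column-major order, and parallelizing that array writes one value per line, which discards the row structure of the confusion matrix. A small pure-Scala sketch (no Spark required; the object and helper names here are mine, not from the original post) of reassembling a column-major array into comma-separated rows before saving:

```scala
// Sketch: turn a column-major value array (the layout returned by
// mllib's Matrix.toArray) back into readable comma-separated rows.
object ConfusionMatrixFormat {
  def toRows(values: Array[Double], numRows: Int, numCols: Int): Seq[String] =
    (0 until numRows).map { i =>
      // Element (i, j) of a column-major matrix sits at index j * numRows + i.
      (0 until numCols).map(j => values(j * numRows + i)).mkString(",")
    }

  def main(args: Array[String]): Unit = {
    // Column-major layout of the 2x2 matrix [[90, 3], [7, 100]].
    val vals = Array(90.0, 7.0, 3.0, 100.0)
    toRows(vals, 2, 2).foreach(println)
  }
}
```

The resulting row strings can then be parallelized and written with `saveAsTextFile` (one matrix row per line), or simply written with plain `java.io`, since the confusion matrix is small and already local to the driver.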