Calling an object from the main class

Time: 2018-02-21 17:05:03

Tags: scala apache-spark

A quick question about calling an object and running it from main. Only the last command needs to be invoked from the main class: val op = df.coalesce(1).write.mode("overwrite").format("csv").save("report")

How can we use an instance of the object and run it to write the CSV?

  def run(implicit context: Context): Unit = {
    val timer = Timer.start()

    // not working
    val newRep = Report_Adhoc
    val d = newRep.tab.toDF()
    val op = d.coalesce(1).write.mode("overwrite").format("csv").save("report")

    println(s"pipeline complete in [${timer.elapsedTime()}]")
  }

The main class is the one above, but it throws a NullPointerException.

object Report_Adhoc extends App with TransientLogger {

  // remaining code omitted (too verbose)

  val tab =
    counts
      .filter(c => c._1.id.nonEmpty && c._2.id.nonEmpty)
      .map(c => (c._1, c._2, c._3, c._3.values.sum))
      .sort($"_4".desc)
      .map(count =>
        row(
          count._1.id, count._1.label,
          count._2.id, count._2.label,
          count._3(CITE), count._3(CROSS), count._3(MANUAL),
          count._3(RECIPROCAL), count._3(TRANSITIVE), count._3(FAMILY),
          count._4
        )
      )

  val df = tab.toDF()

  val op = df.coalesce(1).write.mode("overwrite").format("csv").save("report")

}

1 answer:

Answer 0 (score: -1)

Take a look at these:

https://www.scala-lang.org/api/2.12.3/scala/DelayedInit.html

https://www.scala-lang.org/api/2.12.3/scala/App.html

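Also, don't put underscores in names. Don't reuse the main class/object. Last but not least, make a minimal reproduction instead of pasting large blocks of code like this.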
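
To make the DelayedInit point concrete, here is a minimal, Spark-free sketch. It assumes Scala 2.x, where the body of an object extending App is deferred until its main method runs. All names (ReportApp, ReportData, Demo) are hypothetical and stand in for Report_Adhoc and the pipeline's main class; this is an illustration of the likely cause, not a confirmed diagnosis of the code above.

```scala
// Hypothetical names throughout; none of them come from the question's code base.

// Same shape as Report_Adhoc: the vals live in the body of an App object.
// Under Scala 2.x that body is deferred (DelayedInit) until main() runs.
object ReportApp extends App {
  val rows: Seq[(String, Int)] = Seq(("a", 1), ("b", 2))
}

// Plain object: no App, so nothing is deferred and the data is built on demand.
object ReportData {
  def rows: Seq[(String, Int)] = Seq(("a", 1), ("b", 2))
  def total: Int = rows.map(_._2).sum
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Expected to print "true" on Scala 2.x, because ReportApp.main was never called
    // and the field is therefore still uninitialized.
    println(ReportApp.rows == null)
    // Safe from any caller: the plain object has no deferred body.
    println(ReportData.total)
  }
}
```

Applied to the question, one option would be to move the construction of tab out of the App body into a method on a plain object, and have run call that method before d.coalesce(1).write.mode("overwrite").format("csv").save("report").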