Errors compiling Spark WordCount from the command line

Asked: 2018-07-18 02:34:03

Tags: scala apache-spark word-count

I am trying to compile and run a Scala WordCount program from the command line, without any Maven or sbt support. The command I use to compile the Scala program is

scalac -classpath /spark-2.3.0-bin-hadoop2.7/jars/ Wordcount.scala

import org.apache.spark._
import org.apache.spark.SparkConf

/** Create a RDD of lines from a text file, and keep count of
 *  how often each word appears.
 */
object wordcount {

  def main(args: Array[String]) {
      // Set up a SparkContext named WordCount that runs locally using
      // all available cores.
      val conf = new SparkConf().setAppName("WordCount")
      conf.setMaster("local[*]")
      val sc = new SparkContext(conf)
      // ... (rest of the word-count logic)
  }
}

My research: I looked at the source code and confirmed that the import statements match the packages inside the jars they need.
For example, SparkConf is in the package org.apache.spark, which is exactly what the program references.

https://github.com/apache/spark/blob/v2.3.1/core/src/main/scala/org/apache/spark/SparkConf.scala

The errors I get:

  

Wordcount.scala:3: error: object apache is not a member of package org
import org.apache.spark._
           ^
Wordcount.scala:4: error: object apache is not a member of package org
import org.apache.spark.SparkConf
           ^
Wordcount.scala:14: error: not found: type SparkConf
      val conf = new SparkConf().setAppName("WordCount")
                     ^
Wordcount.scala:16: error: not found: type SparkContext
      val sc = new SparkContext(conf)
                   ^
four errors found

1 Answer:

Answer 0 (score: 2)

Try this:

scalac -classpath "/spark-2.3.0-bin-hadoop2.7/jars/*" Wordcount.scala

The scalac command in your question is the problem. If you want to pick up all the jars in a directory and put them on the classpath, you need the * wildcard, and you need to wrap the path in double quotes so the shell does not expand it first.

See: Including all the jars in a directory within the Java classpath for details.
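A quick way to see why the quotes matter, sketched with a hypothetical /tmp/cpdemo directory: unquoted, the shell expands the glob itself, so the JVM never receives the dir/* classpath wildcard it knows how to expand into jars.

```shell
# Set up a throwaway directory with two fake jars (hypothetical paths)
rm -rf /tmp/cpdemo && mkdir -p /tmp/cpdemo
touch /tmp/cpdemo/a.jar /tmp/cpdemo/b.jar

# Unquoted: the shell expands the glob into separate arguments,
# so scalac would receive multiple arguments instead of one classpath
printf '%s\n' /tmp/cpdemo/*

# Quoted: the literal pattern reaches the JVM untouched, and the JVM
# itself expands "dir/*" to every jar in that directory (Java 6+)
printf '%s\n' "/tmp/cpdemo/*"
```

The same reasoning explains the original failure: passing just the directory (no wildcard) puts the directory, not the jars inside it, on the classpath, so org.apache.spark cannot be found.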