I am trying to run a simple word count in Spark using Scala, but I am getting these two errors. I am fairly new to Scala and cannot figure out what is wrong.
error: ')' expected but '(' found.
println("sparkcontext created")
^
error: ';' expected but 'val' found.
val lines = sc.textFile(inputFile)
^
The code I am trying to run is:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
object SparkWordCount
{
  def main(args: Array[String])
  {
    // checking to see if the user has entered all the required args
    if (args.length < 2)
    {
      System.err.println("Usage: <Input_File> <Output_File>")
      System.exit(1)
    }
    val inputFile = args(0)
    val outputFile = args(1)

    // Here we create a new SparkContext instance called sc.
    val sc = new SparkContext(spark://hdn1001.local:7077, "Scala Word Count", System.getenv("SPARK_HOME"), SparkContext.jarOfClass(this.getClass))
    println("sparkcontext created")

    // Input File
    println("Parsing InputFile")
    val lines = sc.textFile(inputFile)
    println("Parsing InputFile completed")

    // split each document into words
    val words = lines.flatMap(line => line.split(" "))
    println("Split each line of the InputFile into words")

    // Count the occurrence of each word
    val result = words.map(word => (word, 1))
                      .reduceByKey((x, y) => x + y)
    println("WordCount completed")

    // save the output to a file
    result.saveAsTextFile(outputFile)
  }
}
I can't figure it out. I checked the Scala syntax against several docs but could not see what the problem is.
Thanks to balaji, I did solve the problem.
Answer 0 (score: 1)
Just wrap the Spark master URL in double quotes, since it is simply a string. Without the quotes, the Scala compiler tries to parse spark://hdn1001.local:7077 as code, and the // starts a line comment that swallows the rest of that line, including the closing parenthesis; that is what produces the confusing errors on the lines below. The line should look like:
val sc = new SparkContext("spark://hdn1001.local:7077", "Scala Word Count",System.getenv("SPARK_HOME"), SparkContext.jarOfClass(this.getClass))