I am trying to do the following, but I get an error when I use the groupByKey transformation. I am running Spark in standalone mode.
sample.sbt contains:
name := "Spark Join"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"
fork := true
My Scala code:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import java.util.Properties

object yelpDataJoin {
  def main(args: Array[String]): Unit = {
    val reviewFile = "/home/prasad/Desktop/BigData/psp150030_HW3/data/review3.csv"
    val conf = new SparkConf().setAppName("SparkJoins")
    val sc = new SparkContext(conf)
    // Read the review file, asking for a minimum of 2 partitions.
    val reviewData = sc.textFile(reviewFile, 2)
    // Key each record by field 2, pair it with (field 20, 1), then group by key.
    val groupReviewData = reviewData
      .map(line => line.split("::"))
      .map(word => (word(2), (word(20), 1)))
      .groupByKey()
    // foreach returns Unit, so print in a separate step instead of assigning its result.
    groupReviewData.foreach(println)
  }
}
I get the following error message:
15/07/20 16:10:48 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 32.0 KB, free 265.4 MB)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/rdd/RDD$
at yelpDataJoin$.main(HW3_Question2.scala:14)
at yelpDataJoin.main(HW3_Question2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.rdd.RDD$
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 9 more
Please let me know if I am doing something wrong here.
Thanks & regards,
Prasad