What is the correct Scala version for Spark?

Time: 2017-06-05 07:19:42

Tags: scala maven apache-spark apache-spark-sql

I am confused about which Scala version I should use. I get this error when running my application with spark-submit:

17/06/05 06:59:46 ERROR yarn.ApplicationMaster: User class threw 
exception: java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
    at com.xxx.push_up.App$.main(App.scala:255)
    at com.xxx.push_up.App.main(App.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
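
Reading the signature in the message: Lscala/reflect/api/JavaMirrors$JavaMirror; is, as far as I can tell, the return type that code compiled against Scala 2.10 expects; in Scala 2.11 runtimeMirror has a different bytecode signature, so a class compiled against 2.10 throws exactly this NoSuchMethodError on a 2.11 runtime. One way to hunt for a 2.10-built artifact on the classpath (a sketch, assuming a Unix shell with Maven on the PATH):

mvn dependency:tree -Dincludes=org.scala-lang
mvn dependency:tree | grep _2.10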

I found that this is caused by a Scala version mismatch between compile time and run time. I used this code to check the Scala version at runtime:

println("SparkContext version: "+ sc.version)
println("Scala version: "+ scala.tools.nsc.Properties.versionString)

The output is:

SparkContext version: 2.1.1
Scala version: version 2.11.8
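
(A side note: scala.tools.nsc.Properties comes from the scala-compiler jar; the standard library's scala.util.Properties reports the same string and works even when the compiler jar is not on the classpath:)

// needs only scala-library, no scala-compiler jar
println("Scala version: " + scala.util.Properties.versionString)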

My pom.xml is:

...
    <build>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.1.6</version>
                <configuration>
                    <scalaCompatVersion>2.11.8</scalaCompatVersion>
                    <scalaVersion>2.11.8</scalaVersion>
                </configuration>
                <executions>
                    <execution>
                        <phase>compile</phase>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.1.1</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.1.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.1.1</version>
        </dependency>
    </dependencies>
....
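
Two details in this pom that I am unsure about: as far as I know, scala-maven-plugin's scalaCompatVersion is meant to be the binary version (2.11), not the full patch version, and spark-sql_2.11 / spark-hive_2.11 are usually marked provided like spark-core_2.11, since the cluster already ships them. A sketch of the plugin configuration under that assumption:

    <configuration>
        <!-- assumption: the compat version is the binary version, not 2.11.8 -->
        <scalaCompatVersion>2.11</scalaCompatVersion>
        <scalaVersion>2.11.8</scalaVersion>
    </configuration>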

I do not know what is wrong here. Thanks.

The offending code at App.scala:255:

val gcm_log_df = sqlContext
  .createDataFrame(gcm_log_raw_rdd.filter(_.length == 12), gcm_log_raw_schema)
  .filter("pid != 'unknown'")
  .select("pid", "channel")

This is the first place where I use sqlContext, and I think it is what triggers the problem.
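
For reference, the method named in the error can be called directly. This minimal sketch (not my actual App.scala code) exercises the same reflection entry point; if it runs fine in a plain spark-shell on the cluster but my jar fails, the 2.10-vs-2.11 mismatch is in my build rather than in the cluster's runtime:

// same entry point as in the stack trace: scala.reflect.api.JavaUniverse.runtimeMirror
import scala.reflect.runtime.{universe => ru}
val mirror = ru.runtimeMirror(getClass.getClassLoader)
println(mirror)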

I added the Scala dependency explicitly, but the problem is still the same:

<dependency>
   <groupId>org.scala-lang</groupId>
   <artifactId>scala-library</artifactId>            
   <version>2.11.8</version>
</dependency>
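
Since the missing method lives in scala-reflect rather than scala-library, I wonder whether an old scala-reflect is being pulled in transitively; pinning it as well might be worth a try (a sketch, untested):

<dependency>
   <groupId>org.scala-lang</groupId>
   <artifactId>scala-reflect</artifactId>
   <version>2.11.8</version>
</dependency>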

0 Answers:

No answers yet.