Spark Scala: cannot import sqlContext.implicits._

Time: 2016-01-18 12:01:41

Tags: scala maven apache-spark apache-spark-sql

I tried the code below and cannot import sqlContext.implicits._. The Scala IDE reports the following error and the code does not build:

value implicits is not a member of org.apache.spark.sql.SQLContext

Do I need to add any dependencies to pom.xml?

Spark version: 1.5.2

package com.Spark.ConnectToHadoop

import org.apache.spark.SparkConf
import org.apache.spark._
import org.apache.spark.sql._
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.rdd.RDD
//import groovy.sql.Sql.CreateStatementCommand

object CountWords {

  def main(args: Array[String]) {

    val objConf = new SparkConf().setAppName("Spark Connection").setMaster("spark://IP:7077")
    var sc = new SparkContext(objConf)
    val objHiveContext = new HiveContext(sc)
    objHiveContext.sql("USE test")
    var rdd = objHiveContext.sql("select * from Table1")
    val options = Map("path" -> "hdfs://URL/apps/hive/warehouse/test.db/TableName")
    //val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._      // Error
    val dataframe = rdd.toDF()
    dataframe.write.format("orc").options(options).mode(SaveMode.Overwrite).saveAsTable("TableName")
  }
}

My pom.xml file is as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.Sudhir.Maven1</groupId>
  <artifactId>SparkDemo</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>SparkDemo</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.5.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>1.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.10</artifactId>
      <version>1.5.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>0.9.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.10</artifactId>
      <version>1.2.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-jdbc</artifactId>
      <version>1.2.1</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>

  </dependencies>
</project>

5 answers:

Answer 0 (score: 5)

First, create the SQLContext:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

This gives us a sqlContext tied to sc (sc is provided automatically when spark-shell starts). Then:

import sqlContext.implicits._ 
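
The two steps above can be put together as a small standalone program. This is only a sketch, assuming the Spark 1.5.x artifacts (spark-core and spark-sql at the same version) are on the classpath; the `Person` case class, the `ImplicitsSketch` object, the `buildDataFrame` helper and the `local[*]` master are illustrative names that do not appear in the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical record type for illustration. It is declared at the top level
// because toDF() needs a TypeTag for the element type, which a case class
// defined inside a method would not provide.
case class Person(name: String, age: Int)

object ImplicitsSketch {

  // Builds a two-row DataFrame from an RDD of case-class instances.
  def buildDataFrame(sc: SparkContext): DataFrame = {
    // sqlContext must be a val: Scala only allows imports from a stable identifier.
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // brings the RDD-to-DataFrame conversion into scope

    sc.parallelize(Seq(Person("Ann", 30), Person("Bob", 25))).toDF()
  }

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("implicits-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val df = buildDataFrame(sc)
      df.printSchema()
      df.show()
    } finally {
      sc.stop()
    }
  }
}
```

The import has to come after the `val sqlContext = ...` line, because `implicits` is a member of that particular SQLContext instance rather than of a package.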

Answer 1 (score: 4)

As of Spark 2.0.0 (released July 26, 2016), use the following instead:

import spark.implicits._  // spark = SparkSession.builder().getOrCreate()

https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
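
For context, the one-line import above might be used roughly like this. It is a sketch assuming Spark 2.x artifacts on the classpath; the `SessionImplicitsSketch` object, the `buildDataFrame` helper, the sample rows and the `local[*]` master are illustrative choices, not part of the answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SessionImplicitsSketch {

  // Builds a two-row DataFrame from a local Seq of tuples.
  def buildDataFrame(spark: SparkSession): DataFrame = {
    // A method parameter is a stable identifier, so the import is allowed here.
    import spark.implicits._
    Seq(("Ann", 30), ("Bob", 25)).toDF("name", "age")
  }

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption so the sketch runs without a cluster.
    val spark = SparkSession.builder()
      .appName("session-implicits-sketch")
      .master("local[*]")
      .getOrCreate()
    try {
      buildDataFrame(spark).show()
    } finally {
      spark.stop()
    }
  }
}
```

With SparkSession there is no separate SQLContext to construct; the session object carries its own `implicits` member.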

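Answer 2 (score: 2)

You are using an old version of Spark SQL: the pom.xml pins spark-sql_2.10 at 1.0.0, and SQLContext.implicits only exists from Spark 1.3.0 onward. Change the dependency so that it matches spark-core: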

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.5.2</version>
</dependency>

Answer 3 (score: 0)

You can also use:

[code snippet not recoverable from the source]

Answer 4 (score: 0)

For those building with sbt, update the library versions to:

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.12" % "2.4.6" % "provided",
  "org.apache.spark" % "spark-sql_2.12" % "2.4.6" % "provided"
)

Then import the SQL implicits as follows:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("appName")
  .getOrCreate()

import spark.sqlContext.implicits._