Cannot find the jar containing the missing class

Date: 2019-07-03 06:00:05

Tags: apache-spark

I cannot find a jar that contains the class org.apache.spark.sql.Row.

I opened the jar file spark-sql_2.11-2.4.3.jar, but it does not contain the org.apache.spark.sql.Row class. The Spark documentation, however, says it should be there: https://spark.apache.org/docs/2.1.1/api/java/org/apache/spark/sql/Row.html
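For reference, one way to check this without unzipping the jar by hand is to list its entries programmatically. A minimal sketch (the jar path and the ListRowEntries object name are placeholders; point the path at wherever your build tool cached the artifact):

import java.util.jar.JarFile
import scala.collection.JavaConverters._

object ListRowEntries extends App {
  // Placeholder path: adjust to your local Ivy/Maven cache.
  val jar = new JarFile("spark-sql_2.11-2.4.3.jar")
  jar.entries().asScala
    .map(_.getName)
    .filter(_ == "org/apache/spark/sql/Row.class")
    .foreach(println)   // prints nothing: Row.class is not in this jar
  jar.close()
}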

import org.apache.spark.sql.SparkSession
import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._

object BulkCopy extends App {
  val spark = SparkSession
    .builder()
    .appName("Spark SQL data sources example")
    .config("spark.some.config.option", "some-value")
    .getOrCreate()

  // Connection details elided; fill in your own values.
  val jdbcHostname = "..."
  val jdbcDatabase = "..."
  val jdbcUsername = "..."
  val jdbcPassword = "..."

  var df = spark.read.parquet("parquet")

  val bulkCopyConfig = Config(Map(
    "url"               -> jdbcHostname,
    "databaseName"      -> jdbcDatabase,
    "user"              -> jdbcUsername,
    "password"          -> jdbcPassword,
    "dbTable"           -> "dbo.RAWLOG_3_1_TEST1",
    "bulkCopyBatchSize" -> "2500",
    "bulkCopyTableLock" -> "true",
    "bulkCopyTimeout"   -> "600"
  ))

  df.bulkCopyToSqlDB(bulkCopyConfig)
}

Error:(17, 13) Symbol 'type org.apache.spark.sql.Row' is missing from the classpath.
This symbol is required by 'type org.apache.spark.sql.DataFrame'.
Make sure that type Row is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'package.class' was compiled against an incompatible version of org.apache.spark.sql.
   var df = spark.read.parquet("parquet")

2 Answers:

Answer 0 (score: 0)

The org.apache.spark.sql.Row class is not part of the jar file spark-sql_2.11-2.4.3.jar; you will find it in spark-catalyst_2.11-2.4.3.jar instead. The spark-sql library declared below depends on the spark-catalyst lib, and your build tool (Maven/sbt) should resolve that transitive dependency for you automatically:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.4.3</version>
</dependency>

OR

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
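The compiler complains on the spark.read.parquet line because DataFrame is only a type alias for Dataset[Row], so Row must be on the classpath even though the source never names it. Once the dependency above resolves, you can confirm which jar Row is actually loaded from; a minimal sketch, e.g. in the Scala REPL or spark-shell:

// Prints the location of the jar that provided Row; with spark-sql 2.4.3
// on the classpath this should point at spark-catalyst_2.11-2.4.3.jar.
val rowClass = Class.forName("org.apache.spark.sql.Row")
println(rowClass.getProtectionDomain.getCodeSource.getLocation)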

Here are the dependencies of the spark-sql lib: (image: spark-sql dependency tree)
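If the image does not load for you, mvn dependency:tree prints the same information on the Maven side; for sbt, the dependencyTree task (built into recent sbt versions, or available via the sbt-dependency-graph plugin) does the equivalent. Either way, you should see spark-catalyst_2.11:2.4.3 listed under spark-sql_2.11:2.4.3.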

Answer 1 (score: 0)