Cannot find a jar containing the org.apache.spark.sql.Row class
I opened the jar file spark-sql_2.11-2.4.3.jar, but it does not contain the org.apache.spark.sql.Row class. However, the Spark documentation says it should be there: https://spark.apache.org/docs/2.1.1/api/java/org/apache/spark/sql/Row.html
import org.apache.spark.sql.SparkSession
import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._

object BulkCopy extends App {

  // Connection settings (actual values elided; placeholders only)
  val jdbcHostname = "..."
  val jdbcDatabase = "..."
  val jdbcUsername = "..."
  val jdbcPassword = "..."

  val spark = SparkSession
    .builder()
    .appName("Spark SQL data sources example")
    .config("spark.some.config.option", "some-value")
    .getOrCreate()

  // Read the source data as a DataFrame
  var df = spark.read.parquet("parquet")

  val bulkCopyConfig = Config(Map(
    "url"               -> jdbcHostname,
    "databaseName"      -> jdbcDatabase,
    "user"              -> jdbcUsername,
    "password"          -> jdbcPassword,
    "dbTable"           -> "dbo.RAWLOG_3_1_TEST1",
    "bulkCopyBatchSize" -> "2500",
    "bulkCopyTableLock" -> "true",
    "bulkCopyTimeout"   -> "600"
  ))

  df.bulkCopyToSqlDB(bulkCopyConfig)
}

Compiling this fails with:
Error:(17, 13) Symbol 'type org.apache.spark.sql.Row' is missing from the classpath.
This symbol is required by 'type org.apache.spark.sql.DataFrame'.
Make sure that type Row is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'package.class' was compiled against an incompatible version of org.apache.spark.sql.
var df = spark.read.parquet("parquet")
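For reference, a minimal sketch of checking a jar's contents programmatically instead of unzipping it by hand; the jar path here is a placeholder for a local copy of the file:

import java.util.jar.JarFile
import scala.collection.JavaConverters._

object ListJarEntries extends App {
  // Placeholder path; point this at the local copy of the jar
  val jar = new JarFile("spark-sql_2.11-2.4.3.jar")
  jar.entries().asScala
    .map(_.getName)
    .filter(_.endsWith("/Row.class"))
    .foreach(println) // prints nothing if Row.class is absent from this jar
  jar.close()
}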
Answer 0 (score: 0)
The org.apache.spark.sql.Row class is not part of the jar file spark-sql_2.11-2.4.3.jar. Instead, you can find it in spark-catalyst_2.11-2.4.3.jar. The spark-sql library dependency below depends on the spark-catalyst lib, and your build tool (Maven/sbt) should be able to resolve that for you automatically:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.4.3</version>
</dependency>
or, with sbt:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
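Once the dependency resolves, one way to confirm which jar the Row class is actually loaded from at runtime is to ask the JVM for its code source. A minimal sketch, assuming the default 2.4.3 artifacts on the classpath:

import org.apache.spark.sql.Row

object WhereIsRow extends App {
  // Prints the location of the jar that Row was loaded from;
  // with the dependency above this should point at spark-catalyst_2.11-2.4.3.jar
  println(classOf[Row].getProtectionDomain.getCodeSource.getLocation)
}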