Unable to create a table locally: Hive support is required

Time: 2019-08-12 14:05:48

Tags: scala apache-spark hive scalatest

I get the error even after setting this configuration:

config("spark.sql.catalogImplementation","hive")


override def beforeAll(): Unit = {
  super[SharedSparkContext].beforeAll()
  SparkSessionProvider._sparkSession = SparkSession.builder().master("local[*]").config("spark.sql.catalogImplementation","hive").getOrCreate()
}

Edit:

This is how I set up the local databases and tables for testing.

val stgDb = "test_stagingDB"
val stgTbl_exp ="test_stagingDB_expected"
val stgTbl_result="test_stg_table_result"

val trgtDb = "test_activeDB"
val trgtTbl_exp ="test_activeDB_expected"
val trgtTbl_result ="test_activeDB_results"

def setUpDb ={
  println("Set up DB started")
  val localPath="file:/C:/Users/vmurthyms/Code-prdb/prdb/com.rxcorp.prdb"
  spark.sql(s"CREATE DATABASE IF NOT EXISTS test_stagingDB LOCATION '$localPath/test_stagingDB.db'")
  spark.sql(s"CREATE DATABASE IF NOT EXISTS test_activeDB LOCATION '$localPath/test_activeDB.db'")
  spark.sql(s"CREATE TABLE IF NOT EXISTS $trgtDb.${trgtTbl_exp}_ina (Id String, Name String)")
  println("Set up DB done")
}
setUpDb

When I run the spark.sql("CREATE TABLE ...") command, I get the following error:

org.apache.spark.sql.AnalysisException: Hive support is required to CREATE Hive TABLE (AS SELECT); 'CreateTable `test_activeDB`.`test_activeDB_expected_ina`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Ignore

at org.apache.spark.sql.execution.datasources.HiveOnlyCheck$$anonfun$apply$12.apply(rules.scala:392)
at org.apache.spark.sql.execution.datasources.HiveOnlyCheck$$anonfun$apply$12.apply(rules.scala:390)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:117)
at org.apache.spark.sql.execution.datasources.HiveOnlyCheck$.apply(rules.scala:390)
at org.apache.spark.sql.execution.datasources.HiveOnlyCheck$.apply(rules.scala:388)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$2.apply(CheckAnalysis.scala:349)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$2.apply(CheckAnalysis.scala:349)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:349)
at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:92)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:74)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:641)
at com.rxcorp.prdb.exe.SitecoreAPIExtractTest$$anonfun$2.setUpDb$1(SitecoreAPIExtractTest.scala:127)
at com.rxcorp.prdb.exe.SitecoreAPIExtractTest$$anonfun$2.apply$mcV$sp(SitecoreAPIExtractTest.scala:130)

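1 Answer:

Answer 0 (score: 1)

It looks like you are almost there (your error message also gives you a clue): you need to call enableHiveSupport() when creating the Spark session. For example: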

SparkSession.builder()
         .master("local[*]")
         .config("spark.sql.catalogImplementation","hive")
         .enableHiveSupport()
         .getOrCreate()

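Also, when you use enableHiveSupport(), setting config("spark.sql.catalogImplementation","hive") looks redundant, since enableHiveSupport() sets that property itself. I think you can safely remove that part.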
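To put the pieces together, here is a minimal sketch of a standalone check, under a few assumptions that are not in the question: the spark-hive module is on the classpath (without it, enableHiveSupport() throws an IllegalArgumentException), no other SparkSession has been created earlier in the same JVM (getOrCreate() returns an existing session if there is one), and the object name HiveSupportSketch and the app name are hypothetical. The database and table names are taken from the question.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone object, for illustration only.
object HiveSupportSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: spark-hive is on the classpath and no session exists yet.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("hive-support-sketch") // hypothetical app name
      .enableHiveSupport()            // switches the catalog to Hive
      .getOrCreate()

    // Same kind of statements as in the question, without a custom LOCATION.
    spark.sql("CREATE DATABASE IF NOT EXISTS test_activeDB")
    spark.sql(
      "CREATE TABLE IF NOT EXISTS test_activeDB.test_activeDB_expected_ina " +
        "(Id String, Name String)")

    // List the tables to confirm that the CREATE TABLE went through.
    spark.sql("SHOW TABLES IN test_activeDB").show()

    spark.stop()
  }
}
```

With a local master and no external metastore configured, Spark creates an embedded Derby metastore (a metastore_db directory) and a spark-warehouse directory under the working directory, so you may want to clean those up between test runs.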