First, let's create a Hive-enabled Spark session:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.config(conf).enableHiveSupport.getOrCreate // conf: an existing SparkConf
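For reference, here is a minimal self-contained variant of the same setup (a sketch only; the app name, Thrift host, and port are placeholders, assuming the remote metastore's URI is known) that pins the metastore location explicitly instead of relying on a hive-site.xml found on the classpath:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("remote-hive-test") // hypothetical app name
  .config("hive.metastore.uris", "thrift://metastore-host:9083") // placeholder URI
  .enableHiveSupport()
  .getOrCreate()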
Then let's try to connect to the remote DB:
spark.sql("use my_remote_db").show
17/12/10 10:27:02 WARN ObjectStore: Failed to get database my_remote_db, returning NoSuchObjectException
Exception in thread "main" org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'my_remote_db' not found;
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.org$apache$spark$sql$catalyst$catalog$SessionCatalog$$requireDbExists(SessionCatalog.scala:173)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.setCurrentDatabase(SessionCatalog.scala:268)
at org.apache.spark.sql.execution.command.SetDatabaseCommand.run(databases.scala:59)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:182)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:67)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:623)
at com.mycompany.sbseg.graph.poc.features.LoadGraphData$.loadGraphData(LoadGraphData.scala:8)
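As a quick sanity check (these are standard Spark catalog APIs), listing the databases the session can actually see helps narrow this down; if only default comes back, the session is presumably talking to a freshly created local metastore rather than the remote one:

// Both of these should include my_remote_db when the remote metastore is reachable.
spark.sql("show databases").show(truncate = false)
spark.catalog.listDatabases.show(truncate = false)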
Note: the same code works from spark-shell.

Here are the additional settings in IntelliJ used to simulate the shell environment in which spark-shell is run from bash. To make sure they are set correctly, they are printed out:

Seq("SPARK_HOME", "HIVE_CONF_DIR", "HIVE_HOME")
  .foreach { s => println(s"$s: ${System.getenv(s)}") }

These print out the same/correct results as on the command line.
So it is unclear what the difference between the IntelliJ and bash environments could be, and why the code does not work in the former.
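One difference worth ruling out (an assumption on my part: Spark finds the Hive metastore through a hive-site.xml on the JVM's classpath, not through environment variables, so HIVE_CONF_DIR matters to the launch scripts but not necessarily to a JVM started by IntelliJ) is whether the IDE-launched JVM can see the file at all:

// Prints the resolved URL of hive-site.xml, or null if it is not on
// the runtime classpath of the JVM that IntelliJ launches.
println(getClass.getClassLoader.getResource("hive-site.xml"))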