Querying Hive with Spark SQL

Time: 2019-05-10 00:49:24

Tags: apache-spark hive pyspark jupyter-notebook livy

When I run the query below, I get the following error. How can I fix it?

from pyspark.sql import SparkSession
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
spark.sql("show databases").show()
  

java.io.FileNotFoundException: Source '/var/lib/livy/.ivy2/jars/org.apache.zookeeper_zookeeper-3.4.6.jar' does not exist
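As a first diagnostic, it may help to confirm whether the jar named in the error is really missing and how far down the directory tree exists. The sketch below is only a sketch: `describe_path` is a hypothetical helper, not part of Spark or Livy, and it assumes it runs on the node (and as the user) where the Livy session's driver runs.

```python
import os

# Path copied verbatim from the error message above.
JAR = "/var/lib/livy/.ivy2/jars/org.apache.zookeeper_zookeeper-3.4.6.jar"


def describe_path(path):
    """Report whether a file exists and, if not, the deepest ancestor directory that does.

    Hypothetical helper for illustration only.
    """
    if os.path.isfile(path):
        return {"exists": True, "nearest_existing_dir": os.path.dirname(path)}
    parent = os.path.dirname(path)
    # Walk upward until an existing directory is found (stops at the root).
    while parent and not os.path.isdir(parent):
        new_parent = os.path.dirname(parent)
        if new_parent == parent:
            break
        parent = new_parent
    return {"exists": False, "nearest_existing_dir": parent}


if __name__ == "__main__":
    print(describe_path(JAR))
```

If the nearest existing directory is `/var/lib/livy/.ivy2/jars`, only that one jar is absent. If it is higher up, the Ivy cache directory itself was never created or is not visible to this user.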

I am using AWS EMR with the following components:

Hive 2.3.4, Pig 0.17.0, JupyterHub 0.9.4, Ganglia 3.7.2, Spark 2.4.0, HBase 1.4.9

spark-defaults.conf has the following relevant configuration:

hive.metastore.uris              thrift://<node>:9083
spark.sql.broadcastTimeout       300
spark.sql.catalogImplementation  hive
spark.sql.warehouse.dir          hdfs:///user/spark/warehouse
spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.dynamodbv2
spark.sql.hive.metastore.jars    maven
spark.sql.hive.metastore.version 2.3
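To see at a glance which metastore-related settings are in play, the listing above can be parsed into a dictionary. This is a minimal sketch, not Spark's own loader: `parse_spark_defaults` is a hypothetical helper that assumes whitespace-separated `key value` lines and ignores the rest of the Java properties syntax. With `spark.sql.hive.metastore.jars` set to `maven`, Spark downloads the Hive metastore jars from Maven repositories at runtime; whether that download is what populates the `.ivy2` path in the error is an assumption worth checking, not something confirmed here.

```python
# Configuration lines copied from the question; <node> is left as-is.
SAMPLE = """\
hive.metastore.uris              thrift://<node>:9083
spark.sql.broadcastTimeout       300
spark.sql.catalogImplementation  hive
spark.sql.warehouse.dir          hdfs:///user/spark/warehouse
spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.dynamodbv2
spark.sql.hive.metastore.jars    maven
spark.sql.hive.metastore.version 2.3
"""


def parse_spark_defaults(text):
    """Parse whitespace-separated 'key value' lines, skipping blanks and # comments.

    Hypothetical helper; a simplification of the real properties format.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


if __name__ == "__main__":
    settings = parse_spark_defaults(SAMPLE)
    # Show only the keys that mention the metastore.
    for key, value in settings.items():
        if "metastore" in key:
            print(key, "=", value)
```

The same function could be pointed at the contents of the actual spark-defaults.conf on the cluster to compare it with what the notebook session reports.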

0 answers:

No answers