Creating a Hive table from Spark in Python fails with a ParseException: Cannot create hive serde table
I am running on Hortonworks HDP 2.6.
The code is:
from os.path import abspath

from pyspark.sql import SparkSession

# Directory Spark SQL uses as its warehouse location
warehouse_location = abspath('spark-warehouse')

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL Hive integration example") \
    .config("spark.sql.warehouse.dir", warehouse_location) \
    .enableHiveSupport() \
    .getOrCreate()

# spark is an existing SparkSession
spark.sql("CREATE TABLE IF NOT EXISTS tom (key INT, value STRING) USING hive")
This produces the following error:
INFO SparkSqlParser: Parsing command: CREATE TABLE IF NOT EXISTS tom (key INT, value STRING) USING hive
Traceback (most recent call last):
File "/usr/repos/dataconnect/model/create_model.py", line 17, in <module>
spark.sql("CREATE TABLE IF NOT EXISTS tom (key INT, value STRING) USING hive")
File "/usr/hdp/2.6.1.0-129/spark2/python/lib/pyspark.zip/pyspark/sql/session.py", line 545, in sql
File "/usr/hdp/2.6.1.0-129/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/usr/hdp/2.6.1.0-129/spark2/python/lib/pyspark.zip/pyspark/sql/utils.py", line 73, in deco
pyspark.sql.utils.ParseException: u'\nCannot create hive serde table with CREATE TABLE USING\n== SQL ==\nCREATE TABLE IF NOT EXISTS tom (key INT, value STRING) USING hive'
Answer 0 (score: 0)
scala> hiveContext.sql("CREATE TABLE IF NOT EXISTS tom (key INT, value STRING) row format delimited fields terminated by ','")
res157: org.apache.spark.sql.DataFrame = []
scala> hiveContext.sql("select * from tom");
res158: org.apache.spark.sql.DataFrame = [key: int, value: string]
scala> hiveContext.sql("select * from tom").show()
+---+-----+
|key|value|
+---+-----+
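Since the question is about PySpark, here is a rough Python counterpart of the Scala session above. It is only a sketch: `build_delimited_ddl` and `create_delimited_table` are hypothetical helper names (not part of any Spark API), and `spark` is assumed to be a SparkSession created with `enableHiveSupport()` as in the question.

```python
def build_delimited_ddl(table, columns, delimiter=","):
    """Return Hive-style CREATE TABLE DDL for a delimited text table.

    columns is a list of (name, type) pairs, e.g. [("key", "INT")].
    """
    cols = ", ".join("{} {}".format(name, dtype) for name, dtype in columns)
    return (
        "CREATE TABLE IF NOT EXISTS {} ({}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '{}'"
    ).format(table, cols, delimiter)


def create_delimited_table(spark, table, columns, delimiter=","):
    """Create the table through spark.sql and return a DataFrame over it."""
    # Assumption: spark is a SparkSession built with enableHiveSupport()
    spark.sql(build_delimited_ddl(table, columns, delimiter))
    return spark.sql("SELECT * FROM {}".format(table))
```

With the session from the question, `create_delimited_table(spark, "tom", [("key", "INT"), ("value", "STRING")]).show()` would issue the same statement as the Scala example.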
Answer 1 (score: 0)
Just remove the "USING hive" clause, so the command becomes:
spark.sql("CREATE TABLE IF NOT EXISTS spark_hive_table (key INT, value STRING)")
This command creates the table in Hive. (I changed the table name to spark_hive_table; you can use your own name.)
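If the same script has to run on clusters with different Spark versions, the clause can be added conditionally. The sketch below rests on an assumption that is not stated in this thread: that `USING hive` is only accepted from Spark 2.2 onward and that the Spark bundled with HDP 2.6.1 is older than that. Check `spark.version` on your own cluster before relying on it. `supports_using_hive` and `create_table_sql` are hypothetical helper names.

```python
def supports_using_hive(spark_version):
    """Return True if this version is assumed to accept CREATE TABLE ... USING hive.

    Only the first two components are inspected, so vendor-suffixed
    version strings with extra dotted parts are handled as well.
    """
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    # Assumption: the USING hive clause was introduced in Spark 2.2
    return (major, minor) >= (2, 2)


def create_table_sql(spark_version, table="spark_hive_table"):
    """Build the CREATE TABLE statement, adding USING hive only when supported."""
    ddl = "CREATE TABLE IF NOT EXISTS {} (key INT, value STRING)".format(table)
    if supports_using_hive(spark_version):
        ddl += " USING hive"
    return ddl
```

It would be called as `spark.sql(create_table_sql(spark.version))`, which falls back to the plain statement from this answer on older versions.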