Spark / Python - Creating a table with Hive fails with ParseException

Asked: 2017-11-28 14:15:57

Tags: apache-spark hive

Creating a Hive table from Spark in Python fails with a ParseException:

 Cannot create hive serde table

This is running on Hortonworks HDP 2.6.

The code is:

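from os.path import abspath

from pyspark.sql import SparkSession

# Location of the Spark SQL warehouse directory
warehouse_location = abspath('spark-warehouse')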

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL Hive integration example") \
    .config("spark.sql.warehouse.dir", warehouse_location) \
    .enableHiveSupport() \
    .getOrCreate()

# spark is an existing SparkSession
spark.sql("CREATE TABLE IF NOT EXISTS tom (key INT, value STRING) USING hive")

This produces the error:

 INFO SparkSqlParser: Parsing command: CREATE TABLE IF NOT EXISTS tom (key INT, value STRING) USING hive
Traceback (most recent call last):
  File "/usr/repos/dataconnect/model/create_model.py", line 17, in <module>
    spark.sql("CREATE TABLE IF NOT EXISTS tom (key INT, value STRING) USING hive")
  File "/usr/hdp/2.6.1.0-129/spark2/python/lib/pyspark.zip/pyspark/sql/session.py", line 545, in sql
  File "/usr/hdp/2.6.1.0-129/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/usr/hdp/2.6.1.0-129/spark2/python/lib/pyspark.zip/pyspark/sql/utils.py", line 73, in deco
pyspark.sql.utils.ParseException: u'\nCannot create hive serde table with CREATE TABLE USING\n== SQL ==\nCREATE TABLE IF NOT EXISTS tom (key INT, value STRING) USING hive'

2 answers:

Answer 0 (score: 0)

scala> hiveContext.sql("CREATE TABLE IF NOT EXISTS tom (key INT, value STRING) row format delimited fields terminated by ','")

res157: org.apache.spark.sql.DataFrame = []


scala> hiveContext.sql("select * from tom");

res158: org.apache.spark.sql.DataFrame = [key: int, value: string]

scala> hiveContext.sql("select * from tom").show()
+---+-----+
|key|value|
+---+-----+
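
Since the question uses PySpark rather than the Scala shell, the same two statements can be sketched in Python. This is a minimal sketch, not the answerer's code: the helper name create_delimited_table is hypothetical, and it assumes `spark` is a SparkSession built with enableHiveSupport() as in the question.

```python
def create_delimited_table(spark, table="tom"):
    """Create a comma-delimited Hive table and return a DataFrame over it.

    `spark` is assumed to be a Hive-enabled SparkSession; this helper and
    its name are illustrative only.
    """
    # Hive-style DDL without a USING clause, as in the Scala answer above.
    spark.sql(
        "CREATE TABLE IF NOT EXISTS {} (key INT, value STRING) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','".format(table)
    )
    # Query the (initially empty) table back.
    return spark.sql("SELECT * FROM {}".format(table))
```

With the session from the question, create_delimited_table(spark).show() should print the same empty key/value table as the Scala session above.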

Answer 1 (score: 0)

Just remove "USING hive". The command then becomes:

spark.sql("CREATE TABLE IF NOT EXISTS spark_hive_table (key INT, value STRING)")

This command creates the table in Hive. (I changed the table name to spark_hive_table; you can use your own name.)
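
If the same script has to run on clusters with different Spark versions, the DDL can be chosen from the version string. The sketch below rests on an assumption that should be checked against your distribution: that the "USING hive" clause is only parsed from Spark 2.2 onward, which would explain the ParseException on the Spark bundled with HDP 2.6. The helper names supports_using_hive and create_table_ddl are hypothetical.

```python
def supports_using_hive(spark_version):
    """Return True if this version is assumed to parse `USING hive`.

    Assumption: the clause is accepted from Spark 2.2 onward. Only the
    first two dot-separated components of the version string are examined,
    so vendor-suffixed strings such as "2.1.1.2.6.1.0-129" also work.
    """
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    return (major, minor) >= (2, 2)


def create_table_ddl(table, columns, spark_version):
    """Build a CREATE TABLE statement suited to the given Spark version."""
    ddl = "CREATE TABLE IF NOT EXISTS {} ({})".format(table, columns)
    if supports_using_hive(spark_version):
        # Newer parsers accept the explicit Hive provider clause.
        ddl += " USING hive"
    return ddl
```

With the session from the question, this could be called as spark.sql(create_table_ddl("spark_hive_table", "key INT, value STRING", spark.version)). On a version below the assumed cutoff it yields the plain statement shown in this answer.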