I am running a Hive query with spark.sql and storing the result in a DataFrame, like this:
DF1=spark.sql(""" select .........""")
import sys
from pyspark.sql import SparkSession
from pyspark.sql import DataFrame
spark = SparkSession\
.builder\
.master("yarn")\
.appName("03_Pull_ILS_landing_attach_RETL_A.") \
.enableHiveSupport()\
.getOrCreate()
DF_01=spark.sql("""
select
column1,
column2,
column3,
where condition
"""
)
When I run the .py file with spark-submit:
spark-submit \
--conf "spark.dynamicAllocation.enabled=false" \
--master yarn \
--deploy-mode cluster \
--driver-memory 1g \
--num-executors 40 \
--executor-cores 4 \
--executor-memory 26g \
--queue queuename \
pythonfile.py
I always get this error:
Log Length: 3249
Traceback (most recent call last):
File "pythonfile.py", line 36, in <module>
"""
The error points at the closing """ of the query string.
How should I write my Hive query inside spark.sql for the DataFrame?
Answer 0 (score: 0)
Remove the trailing comma after column3, add a FROM clause with the table name before the WHERE, and Bob's your uncle. Or is it something else?
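To make that fix concrete, here is a minimal sketch of the corrected query string. The table name `my_table` and the filter `condition` are placeholders, not from the original post; the checks at the end only inspect the string, so no Spark session is needed:

```python
# Corrected form of the query: the trailing comma after column3 is removed
# and a FROM clause names the table before the WHERE clause.
# "my_table" and "condition" are placeholders.
query = """
select
    column1,
    column2,
    column3
from my_table
where condition
"""

# In the original script this string would be passed to Spark:
# DF_01 = spark.sql(query)

# Sanity checks on the string itself (no Spark session required):
assert "from my_table" in query   # FROM clause present
assert "column3," not in query    # no trailing comma before FROM
```

Without the FROM clause, Spark's SQL parser keeps expecting more input and only fails when it hits the end of the string, which is why the traceback points at the closing `"""` rather than at the real mistake.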