I am running a Hive query with spark.sql and storing the result in a DataFrame, like this:
DF1=spark.sql(""" select .........""")
import sys
from pyspark.sql import SparkSession
from pyspark.sql import DataFrame
spark = SparkSession\
.builder\
.master("yarn")\
.appName("03_Pull_ILS_landing_attach_RETL_A.") \
.enableHiveSupport()\
.getOrCreate()
DF_01=spark.sql("""
select
column1,
column2,
column3,
where condition
"""
)
When I run the .py file with spark-submit:
spark-submit \
--conf "spark.dynamicAllocation.enabled=false" \
--master yarn \
--deploy-mode cluster \
--driver-memory 1g \
--num-executors 40 \
--executor-cores 4 \
--executor-memory 26g \
--queue queuename \
pythonfile.py
I always get this error:
Log Length: 3249
Traceback (most recent call last):
File "pythonfile.py", line 36, in <module>
"""
The error points at the closing """ of the query string.
How should I write my Hive query inside spark.sql for the DataFrame?
Answer 0 (score: 0)
Remove the trailing comma after column3, add a FROM clause with the table name before the WHERE, and Bob's your uncle. Or is it something else?
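To make that fix concrete, here is a minimal sketch of the corrected query string. The table name `my_table` and the filter `condition` are placeholders, not from the original post; the checks at the end only inspect the string, so no Spark session is needed:

```python
# Corrected form of the query: the trailing comma after column3 is removed
# and a FROM clause names the table before the WHERE clause.
# "my_table" and "condition" are placeholders.
query = """
select
    column1,
    column2,
    column3
from my_table
where condition
"""

# In the original script this string would be passed to Spark:
# DF_01 = spark.sql(query)

# Sanity checks on the string itself (no Spark session required):
assert "from my_table" in query   # FROM clause present
assert "column3," not in query    # no trailing comma before FROM
```

Without the FROM clause, Spark's SQL parser keeps expecting more input and only fails when it hits the end of the string, which is why the traceback points at the closing `"""` rather than at the real mistake.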