Pyspark IndentationError: expected an indented block

Time: 2018-08-05 08:24:35

Tags: pyspark-sql

I am trying to send the result of a pyspark.sql query to a text file using the pyspark command saveAsTextFile.

I tried the following code to do this:

def main():
#Order by sales descending
example8 = spark.sql("""SELECT
    *
FROM sales_info
ORDER BY Sales DESC""")
print.example8.collect()
example8.saveAsTextFile("juyfd")

main()

However, I get the following error:

  File "<ipython-input-25-ce236630f96f>", line 3
    example8 = spark.sql("""SELECT
           ^
IndentationError: expected an indented block

After indenting example8, I get the following error instead:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-26-337d9fc099ad> in <module>()
      5 FROM sales_info
      6 ORDER BY Sales DESC""")
----> 7 print.example8.collect()
      8 example8.saveAsTextFile("juyfd")
      9 

AttributeError: 'builtin_function_or_method' object has no attribute 'example8'

1 answer:

Answer 0 (score: 0)

The first error occurs because the statements inside the main function must be indented:

def main():
    # Order by sales descending
    example8 = spark.sql("""SELECT
    *
    FROM sales_info
    ORDER BY Sales DESC""")
    # print is a function, so call it instead of accessing an attribute on it
    print(example8.collect())
    # A DataFrame has no saveAsTextFile method;
    # use pyspark.sql.DataFrameWriter to save as CSV instead
    example8.write.csv("path_in_hdfs.csv", sep=';')

main()

You can also skip the main function entirely and just run the statements shown inside it.
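Both errors from the question can be reproduced without Spark. The sketch below is only an illustration: the source strings are shortened stand-ins for the question's code, and try_compile is a hypothetical helper name, not part of any library.

```python
# Stand-in for the question's function: the body is not indented under `def`.
unindented = "def main():\n#Order by sales descending\nexample8 = 1\n"
# The same source with the body indented by four spaces.
indented = "def main():\n    #Order by sales descending\n    example8 = 1\n"


def try_compile(source):
    """Return None if the source compiles, else the error's class name."""
    try:
        compile(source, "<example>", "exec")
        return None
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return type(err).__name__


first_error = try_compile(unindented)
after_fix = try_compile(indented)

# `print.example8` looks up an attribute on the built-in function itself,
# which is the second error in the question; `print(example8)` is the call form.
try:
    print.example8
    second_error = None
except AttributeError as err:
    second_error = type(err).__name__

print(first_error, after_fix, second_error)
```

The first error is raised while the code is being parsed, before any Spark call runs, which is why the AttributeError only showed up once the indentation had been fixed.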

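Note that the path you specify should be a path in HDFS. You can also specify the separator to use in the CSV output through the sep argument. For more information, see the DataFrameWriter documentation.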