I am trying to write the result of a pyspark.sql query to a text file using the pyspark command saveAsTextFile.
I tried the following code to do this:
def main():
#Order by sales descending
example8 = spark.sql("""SELECT
*
FROM sales_info
ORDER BY Sales DESC""")
print.example8.collect()
example8.saveAsTextFile("juyfd")
main()
However, I get the following error:
File "<ipython-input-25-ce236630f96f>", line 3
example8 = spark.sql("""SELECT
^
IndentationError: expected an indented block
After indenting example8, I get the following error instead:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-26-337d9fc099ad> in <module>()
5 FROM sales_info
6 ORDER BY Sales DESC""")
----> 7 print.example8.collect()
8 example8.saveAsTextFile("juyfd")
9
AttributeError: 'builtin_function_or_method' object has no attribute 'example8'
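Answer 0 (score: 0)
The IndentationError occurs because the statements inside the main function must be indented. The AttributeError that follows has a separate cause: print.example8.collect() looks up an attribute named example8 on the built-in print function instead of calling it, so it should be written print(example8.collect()). In addition, a DataFrame has no saveAsTextFile method. The statements inside main should look like this: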
def main():
    # Order by sales descending
    example8 = spark.sql("""SELECT
    *
    FROM sales_info
    ORDER BY Sales DESC""")
    # print is a function: call it rather than accessing an attribute on it
    print(example8.collect())
    # A DataFrame has no saveAsTextFile method;
    # use pyspark.sql.DataFrameWriter (example8.write) to save as CSV
    example8.write.csv("path_in_hdfs.csv", sep=';')

main()
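You also do not need a main function at all; you can simply run the commands shown inside the function directly.
Note that the path you specify should be a path in HDFS. You can also set the separator used for the CSV output. For more information: DataFrameWriter documentation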
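If a plain text file really is what you want rather than CSV, one option is to go through the DataFrame's underlying RDD, since saveAsTextFile is an RDD method. The following is only a minimal sketch: row_to_line and save_query_as_text are hypothetical helper names introduced here for illustration, and it assumes an existing SparkSession is passed in as spark and that the output path does not exist yet.

```python
def row_to_line(row, sep=";"):
    """Join the values of one row into a single delimited line.

    Hypothetical helper: None values are written as empty strings.
    """
    return sep.join("" if value is None else str(value) for value in row)


def save_query_as_text(spark, query, path, sep=";"):
    """Run `query` and write one text line per result row under `path`.

    Hypothetical helper: `spark` is assumed to be an existing SparkSession.
    """
    df = spark.sql(query)
    # saveAsTextFile belongs to RDD, not DataFrame, so convert via df.rdd
    # and turn every Row into a string before saving.
    df.rdd.map(lambda row: row_to_line(row, sep)).saveAsTextFile(path)


# Example call (needs a running SparkSession named `spark`):
# save_query_as_text(
#     spark,
#     "SELECT * FROM sales_info ORDER BY Sales DESC",
#     "juyfd",
# )
```

As with write.csv, the path given to saveAsTextFile ends up as a directory containing part files rather than a single file.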