I am reading a CSV file with
data = sc.textFile("filename")
Df = Sparksql.createDataFrame()
Pdf = Df.toPandas()
Is Pdf now distributed across the Spark cluster, or does it reside in the host environment?
Answer 0 (score: 0)
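No, Pdf is not distributed across the cluster. toPandas() collects every row into a single pandas DataFrame held in the memory of the driver (host) process.
As the PySpark source code of DataFrame says about toPandas: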
.. note:: This method should only be used if the resulting Pandas's DataFrame is expected
to be small, as all the data is loaded into the driver's memory.
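
Because everything ends up in the driver's memory, one common precaution is to cap the number of rows before converting. The following is only a minimal sketch of that idea, not something from the PySpark source: the helper name to_pandas_capped, its max_rows parameter, and the demo function are hypothetical, while limit() and toPandas() are the regular DataFrame methods. The demo assumes pyspark and pandas are installed and uses a local session.

```python
def to_pandas_capped(spark_df, max_rows=10_000):
    """Convert at most max_rows rows of a Spark DataFrame to pandas.

    Hypothetical helper: the result is a local pandas DataFrame that
    lives entirely in the driver's memory, so the cap bounds its size.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # limit() is a lazy transformation, so the cap is part of the query
    # that toPandas() executes when it collects rows into the driver.
    return spark_df.limit(max_rows).toPandas()


def demo():
    """Not called automatically; needs pyspark and pandas installed."""
    import pandas
    from pyspark.sql import SparkSession

    # Assumption: a single-threaded local session is enough for a demo.
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    try:
        df = spark.range(1_000_000)        # distributed Spark DataFrame
        pdf = to_pandas_capped(df, max_rows=5)
        # pdf is an ordinary pandas object held by the driver process.
        print(isinstance(pdf, pandas.DataFrame), len(pdf))
    finally:
        spark.stop()
```

Calling demo() by hand shows that the value returned by toPandas() is a plain pandas DataFrame. Nothing about it is spread over the executors any more, so it has to fit in the driver's memory.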