Question

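I have two DataFrames that I want to combine with union. After the union, printing the final DataFrame with df.show() shows the records in the expected order (the first DataFrame's records at the top, followed by the second DataFrame's records). However, when I write this final DataFrame to a CSV file, the records from the first DataFrame, which I want at the top of the CSV file, lose their position: they end up mixed in with the records from the second DataFrame. Any help would be appreciated.

Here is a code sample: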

import org.apache.spark.sql.DataFrame

import spark.implicits._ // needed for toDF(); assumes a SparkSession named spark is in scope

val intVar = 1

val myList = List(("hello", intVar))

val firstDf = myList.toDF()

val secondDf: DataFrame = testRdd.toDF() // testRdd is defined elsewhere (not shown)

val finalDF = firstDf.union(secondDf)

finalDF.show() // prints the DataFrame with the firstDf records on top, followed by the secondDf records

val outputFilePath = "/home/out.csv"

finalDF.coalesce(1).write.csv(outputFilePath) // the firstDf records get mixed with the secondDf records
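
One possible way to keep the first DataFrame's rows on top, sketched under some assumptions. Row order in a DataFrame is generally not guaranteed unless the data is sorted explicitly, so the order seen in show() need not carry over to the written file. The sketch below tags each input with a source index and a row id, moves everything into a single partition, and sorts inside that partition before the columns are dropped again. The names OrderedUnionSketch, orderedUnion, src_order and row_order are hypothetical and not from the code above, and the row id only reflects the order in which each DataFrame currently produces its rows.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{lit, monotonically_increasing_id}

object OrderedUnionSketch {
  // Hypothetical helper: union two DataFrames and return a single-partition
  // result in which all rows of `first` precede all rows of `second`.
  def orderedUnion(first: DataFrame, second: DataFrame): DataFrame = {
    // Tag every row with the index of its source DataFrame and an id that
    // increases with partition index and with row position inside a partition.
    // Assumption: neither input already has columns named src_order or row_order.
    def tag(df: DataFrame, sourceIndex: Int): DataFrame =
      df.withColumn("src_order", lit(sourceIndex))
        .withColumn("row_order", monotonically_increasing_id())

    tag(first, 0)
      .union(tag(second, 1))
      .repartition(1)                                  // shuffle all rows into one partition
      .sortWithinPartitions("src_order", "row_order")  // explicit order inside that partition
      .drop("src_order", "row_order")                  // remove the helper columns before writing
  }
}
```

With this in place, the write above could become OrderedUnionSketch.orderedUnion(firstDf, secondDf).write.csv(outputFilePath). Note that write.csv creates a directory at the given path containing a part file, rather than a single file named out.csv.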

Scala Spark: order changes when writing a DataFrame to a CSV file

0 answers: