I have converted a DataFrame to an RDD:
val rows: RDD[Row] = df.orderBy($"Date").rdd
Now I am trying to convert it back:
val df2 = spark.createDataFrame(rows)
But I get an error:
Edit:
rows.toDF()
also produces an error:
Cannot resolve symbol toDF
even though I included this line earlier:
import spark.implicits._
Full code:
import org.apache.spark._
import org.apache.spark.sql._
import org.apache.spark.sql.expressions._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import scala.util._
import org.apache.spark.mllib.rdd.RDDFunctions._
import org.apache.spark.rdd._
object Playground {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("Playground")
      .config("spark.master", "local")
      .getOrCreate()
    import spark.implicits._
    val sc = spark.sparkContext
    val df = spark.read.csv("D:/playground/mre.csv")
    df.show()
    val rows: RDD[Row] = df.orderBy($"Date").rdd
    val df2 = spark.createDataFrame(rows)
    rows.toDF()
  }
}
Answer 0 (score: 2)
Your IDE is right: for an RDD[Row], SparkSession.createDataFrame needs a second argument, either a bean class or a schema.
This will solve your problem:
val df2 = spark.createDataFrame(rows, df.schema)
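For reference, here is a minimal, self-contained sketch of the round trip. The object name RoundTripSketch, the helper roundTrip, and the in-memory sample data are made up for illustration; they stand in for the CSV file from the question. As far as I can tell, rows.toDF() does not resolve because the implicit conversion from spark.implicits._ needs an Encoder for the element type, and no implicit Encoder is provided for Row, so passing the schema explicitly is the simpler route.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object RoundTripSketch {
  // Sorts by the "Date" column, drops down to an RDD[Row], then rebuilds
  // a DataFrame by reusing the schema of the original DataFrame.
  def roundTrip(spark: SparkSession, df: DataFrame): DataFrame = {
    val rows: RDD[Row] = df.orderBy("Date").rdd
    spark.createDataFrame(rows, df.schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("RoundTripSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; the question reads mre.csv instead.
    val df = Seq(("2020-01-02", 2.0), ("2020-01-01", 1.0)).toDF("Date", "Value")

    val df2 = roundTrip(spark, df)
    df2.printSchema()
    df2.show()

    spark.stop()
  }
}
```

Reusing df.schema works here because the rows come straight from df and no columns were added or removed in between. If the RDD is transformed so that the row layout changes, a matching StructType would have to be built by hand.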