How to return multiple DataFrames from a function in Scala?

Asked: 2018-03-17 18:03:34

Tags: scala apache-spark spark-dataframe

I am writing a function that should return multiple DataFrames:

val df1, df2, df3 = getData(spark,df1,df2,df3)

def getData(spark: SparkSession, 
            path1: String, 
            path2: String,
            path3: String) : DataFrame = {

  val epoch = System.currentTimeMillis() / 1000

  val df1 = spark.read.parquet(path1)
  val df2 = spark.read.parquet(path2)
  val df3 = spark.read.parquet(path3)

  df1, df2, df3
}

However, I get a compile error saying that I cannot return df1, df2, df3 like this.

1 Answer:

Answer 0 (score: 2)

You can return a tuple or a list of DataFrames.

For example, returning a tuple of DataFrames:

def getData(spark: SparkSession, 
            path1: String, 
            path2: String,
            path3: String): (DataFrame, DataFrame, DataFrame) = {
  val df1 = spark.read.parquet(path1)
  val df2 = spark.read.parquet(path2)
  val df3 = spark.read.parquet(path3)
  (df1, df2, df3)  // the last expression is the return value
}
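The tuple can then be unpacked into three names at the call site with a destructuring val. Below is a minimal, Spark-free sketch of that pattern: plain strings stand in for DataFrames, and the paths are made-up placeholders.

```scala
// Any Scala method can return several values at once by wrapping them in a tuple.
def getData(path1: String, path2: String, path3: String): (String, String, String) = {
  // Stand-ins for spark.read.parquet(pathN); each value represents one DataFrame.
  (s"df from $path1", s"df from $path2", s"df from $path3")
}

// Destructure the returned tuple into three separate vals:
val (df1, df2, df3) = getData("a.parquet", "b.parquet", "c.parquet")
```

This is checked at compile time: the destructuring val must bind exactly as many names as the tuple has components.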

Returning a list of DataFrames:

def getData(spark: SparkSession, 
            path1: String, 
            path2: String,
            path3: String): List[DataFrame] = {
  val df1 = spark.read.parquet(path1)
  val df2 = spark.read.parquet(path2)
  val df3 = spark.read.parquet(path3)
  List(df1, df2, df3)
}
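A returned list can likewise be taken apart with a pattern-matching val, though unlike the tuple version this is only checked at runtime. A minimal, Spark-free sketch (strings stand in for DataFrames; paths are placeholders):

```scala
// Returning the three results as a List instead of a tuple.
def getData(path1: String, path2: String, path3: String): List[String] =
  List(path1, path2, path3).map(p => s"df from $p")

// Pattern-match the list back into three names. Caution: this throws a
// MatchError at runtime if the list does not have exactly three elements.
val List(df1, df2, df3) = getData("a.parquet", "b.parquet", "c.parquet")
```

A tuple is usually the better fit here: it fixes the arity in the type, and it allows the three DataFrames to have different static types if that ever matters.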