Expression of type Seq[Unit] does not conform to expected type Seq[DataFrame] in Scala

Asked: 2016-11-29 07:14:03

Tags: scala apache-spark dataframe

In my function I return finalDF, a sequence of DataFrames. In the loop shown below, map returns a Seq[DataFrame], which is stored in finalDF so it can be returned to the caller. But in some cases, where there is further processing, I want to keep the filtered DataFrame from each iteration and pass it on to the next loop.

How can I do this? If I try to assign it to some temporary value, it throws an error: expression of type Seq[Unit] does not conform to expected type Seq[DataFrame].
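This error typically means the last expression inside the map body has type Unit (a statement such as an assignment or println), so map produces Seq[Unit] instead of Seq[DataFrame]. A minimal plain-Scala sketch of the same type mismatch, using Int as a stand-in for DataFrame (all names here are illustrative):

```scala
// Stand-in: Seq[Int] plays the role of Seq[DataFrame].
val xs = Seq(1, 2, 3)

// If the last expression in the block is a statement (here an assignment),
// the block's type is Unit, so map yields Seq[Unit]:
var tmp = 0
val bad: Seq[Unit] = xs.map(x => { tmp = x * 2 })

// Keep the value itself as the last expression to get Seq[Int]:
val good: Seq[Int] = xs.map(x => { val y = x * 2; y })
```

The fix in the Spark code is the same: make the filtered DataFrame the last expression of the block passed to map.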

var finalDF: Seq[DataFrame] = null

for (i <- 0 until stop) {
  finalDF = strataCount(i).map(x => {
    df.filter(df(cols(i)) === x)
    // how to get the above data frame to pass on to the next computation?
  })
}
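One way to both keep each iteration's filtered result and feed it into the next iteration is a fold that threads the current value through while collecting intermediates. A rough plain-Scala sketch, where List[Int] stands in for a DataFrame and the filter predicate is invented purely for illustration:

```scala
// Stand-in "DataFrame": a List[Int]; each step filters the previous result.
val df0 = (1 to 100).toList
val thresholds = Seq(10, 25, 50)

// foldLeft threads the filtered result of one step into the next,
// while also collecting every intermediate result in `history`.
val (finalDf, history) =
  thresholds.foldLeft((df0, List.empty[List[Int]])) {
    case ((current, acc), t) =>
      val filtered = current.filter(_ > t) // analogous to df.filter(...)
      (filtered, acc :+ filtered)
  }
```

With Spark the shape is the same: the accumulator would hold the DataFrame produced by the previous iteration's filter, so each step operates on the already-filtered data.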

Regards

1 Answer:

Answer 0 (score: 1)

Maybe this helps:

val finalDF: Seq[DataFrame] = (0 to stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x))).toSeq

flatMap flattens the nested sequence (Seq[Seq[DataFrame]] becomes Seq[DataFrame]).

(0 to stop) will loop from 0 to stop (inclusive). flatMap will flatten the List, like:

scala> (0 to 20).flatMap(i => List(i))
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
scala> (0 to 20).map(i => List(i)).flatten
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)

For two counters, maybe you can do this:

(0 to stop).flatMap(j => {
  (0 to stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x)))
}).toSeq

Or try a for/yield, see: Scala for/yield syntax
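As a sketch of the for/yield form the answer points to, again with plain collections standing in for strataCount and the DataFrame filter (all names and data here are illustrative):

```scala
// Stand-ins: strataCount(i) returns the distinct values of "column" i,
// and a pair (i, x) stands in for df.filter(df(cols(i)) === x).
val strataCount: Seq[Seq[Int]] = Seq(Seq(0, 1), Seq(0, 1, 2))
val stop = strataCount.length

// Desugars to (0 until stop).flatMap(i => strataCount(i).map(x => ...)):
val finalDF: Seq[(Int, Int)] =
  for {
    i <- 0 until stop
    x <- strataCount(i)
  } yield (i, x)
```

The for-comprehension reads more naturally when the index and the per-stratum value are both needed, and it produces exactly the flattened sequence the flatMap version does.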