Question

for (e <- arr02) {
  val df = t04.select("session_id", e)  // right
  val w = Window.partitionBy($"session_id").orderBy($e.desc)  // error
}

e is a string variable. Calling .orderBy($e.desc) with e fails, while .orderBy($"column_name".desc) works.

So how do I use orderBy when the column name is held in a variable?

Answer 1

In your case, you can use org.apache.spark.sql.functions.col:

import org.apache.spark.sql.functions.col

val w = Window.partitionBy($"session_id").orderBy(col(e).desc)

Example:

val df = Seq(("a",2),("b",4)).toDF("A", "B")    

import org.apache.spark.sql.functions.col

df.orderBy($"A".desc).show
+---+---+
|  A|  B|
+---+---+
|  b|  4|
|  a|  2|
+---+---+

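Using a variable as the column name fails, because $e is parsed as an identifier named $e rather than as the $"..." string interpolator: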

val e = "A"
df.orderBy($e.desc).show

<console>:27: error: not found: value $e
       df.orderBy($e.desc).show
                  ^

Use col to build the column from the string:

df.orderBy(col(e).desc).show
+---+---+
|  A|  B|
+---+---+
|  b|  4|
|  a|  2|
+---+---+
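
Applied to the loop from the question, the same idea might look like the sketch below. It is only an illustration: the sample data, the column names clicks and views, the rank column, and the helper rankBy are all assumptions standing in for the real t04 and arr02, which the question does not show.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

object OrderByVariableColumn {

  // Hypothetical helper: ranks the rows of each session by the column
  // whose name is held in the string `e`, highest value first.
  def rankBy(df: DataFrame, e: String): DataFrame = {
    // col(e) turns the string into a Column, so .desc is available.
    // Spark also accepts desc(e), $"$e".desc or df(e).desc here.
    val w = Window.partitionBy(col("session_id")).orderBy(col(e).desc)
    df.select(col("session_id"), col(e))
      .withColumn("rank", row_number().over(w))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("orderBy-variable-column")
      .getOrCreate()
    import spark.implicits._

    // Assumed stand-ins for t04 and arr02 from the question.
    val t04 = Seq(("s1", 3, 10), ("s1", 5, 7), ("s2", 1, 4))
      .toDF("session_id", "clicks", "views")
    val arr02 = Array("clicks", "views")

    for (e <- arr02) {
      rankBy(t04, e).show()
    }

    spark.stop()
  }
}
```

col(e) is used here because it needs only the column name and no reference to a specific DataFrame, which is convenient when the window is defined separately from the data it is applied to.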
