Passing a Row to a UDF and selecting a column based on pattern matching

Time: 2019-07-08 23:41:31

Tags: apache-spark apache-spark-sql

How can I achieve the following by passing a Row to a UDF?

val df1 = df.withColumn("col_Z", 
              when($"col_x" === "a", $"col_A")
              .when($"col_x" === "b", $"col_B")
              .when($"col_x" === "c", $"col_C")
              .when($"col_x" === "d", $"col_D")
              .when($"col_x" === "e", $"col_E")
              .when($"col_x" === "f", $"col_F")
              .when($"col_x" === "g", $"col_G")
      )

As I understand it, only Columns can be passed as arguments to a UDF in Spark's Scala API.

I looked at this question:

How to pass whole Row to UDF - Spark DataFrame filter

and tried to implement this UDF:

def myUDF(r: Row) = udf {
  val z: Float = r.getAs("col_x") match {
    case "a"   => r.getAs("col_A")
    case "b"   => r.getAs("col_B")
    case other => lit(0.0)
  }
  z
}

But I get a type mismatch error:

 error: type mismatch;
 found   : String("a")
 required: Nothing
 case "a" => r.getAs("col_A")
      ^

What am I doing wrong?
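
For reference, a minimal sketch of the struct-based pattern from the linked question might look like the following. It assumes the value columns (col_A, col_B, ...) are FloatType and reuses the column names from the snippets above; note that getAs needs an explicit type parameter, otherwise Scala infers Nothing, which is what the compiler error above points at.

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{col, struct, udf}

// The UDF itself receives a Row; every getAs call carries an explicit type.
val pickColumn = udf { r: Row =>
  r.getAs[String]("col_x") match {
    case "a" => r.getAs[Float]("col_A")
    case "b" => r.getAs[Float]("col_B")
    case _   => 0.0f   // a plain Scala literal, not lit(0.0), which is a Column
  }
}

// struct(...) bundles the columns into a single Row argument for the UDF.
val df1 = df.withColumn("col_Z", pickColumn(struct(df.columns.map(col): _*)))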

0 Answers:

No answers yet