I am trying to create a UDF in Spark 2.2 with the following code:
spark.udf.register(
  "DAYOFWEEK",
  (timestamp: java.sql.Timestamp) => {
    val cal = java.util.Calendar.getInstance()
    cal.setTime(timestamp)
    cal.get(java.util.Calendar.DAY_OF_WEEK)
  }
)
Later, when running the following SQL query:
SELECT DAYOFWEEK(now())
the following exception is thrown:
cannot resolve 'UDF:DAYOFWEEK(current_timestamp())' due to data type mismatch: argument 1 requires bigint type, however, 'current_timestamp()' is of timestamp type.; line 1 pos 7;
What am I doing wrong?
Answer 0 (score: 0)
scala> sqlContext.udf.register(
| "DAYOFWEEK",
| (timestamp: java.sql.Timestamp) => {
| val cal = Calendar.getInstance()
| cal.setTime(timestamp)
| cal.get(Calendar.DAY_OF_WEEK)
| });
res16: org.apache.spark.sql.UserDefinedFunction = UserDefinedFunction(<function1>,IntegerType,List(TimestampType))
scala>
scala> val dd = sqlContext.sql("select DAYOFWEEK(now())")
dd: org.apache.spark.sql.DataFrame = [_c0: int]
scala> dd.show
+---+
|_c0|
+---+
| 4|
+---+
Answer 1 (score: 0)
@Constantine Thanks for the suggestion. The problem was that a UDF with the same name had already been registered, but taking a Date as its argument:
spark.udf.register(
"DAYOFWEEK",
(date: Date) => {
val cal = Calendar.getInstance()
cal.setTime(date)
cal.get(Calendar.DAY_OF_WEEK)
}
)
Once the session had only a single DAYOFWEEK UDF registered, it worked as expected.
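The UDF body itself is plain Scala and can be checked outside of Spark. Below is a minimal standalone sketch of the same Calendar logic (the `dayOfWeek` helper name is mine, not from the original code); note that `Calendar.DAY_OF_WEEK` is 1-based with `SUNDAY` = 1, which may differ from what other SQL dialects return for their built-in DAYOFWEEK.

```scala
import java.sql.Timestamp
import java.util.Calendar

object DayOfWeekSketch {
  // Same logic as the registered UDF: day of week for a timestamp,
  // where SUNDAY = 1, MONDAY = 2, ..., SATURDAY = 7.
  def dayOfWeek(ts: Timestamp): Int = {
    val cal = Calendar.getInstance()
    cal.setTime(ts)
    cal.get(Calendar.DAY_OF_WEEK)
  }

  def main(args: Array[String]): Unit = {
    // 2021-01-04 was a Monday, so DAY_OF_WEEK is 2.
    val monday = Timestamp.valueOf("2021-01-04 00:00:00")
    println(dayOfWeek(monday))
  }
}
```

Testing the function this way makes it easy to confirm the 1-based, Sunday-first convention before wiring it into `spark.udf.register`.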