How to use reflectMethod as a UDF

Asked: 2018-11-21 01:51:49

Tags: scala apache-spark

This is a problem for me :( Could someone help me out :)

Here is my requirement: dynamically load a jar and register a UDF from it.

Code:

    // Test.ImTest
    object ImTest extends Serializable {
      def len(bookTitle: String): String = "ImTest"
    }

    // main
    val ru = scala.reflect.runtime.universe
    val classMirror = ru.runtimeMirror(getClass.getClassLoader)
    val classTest = classMirror.staticModule("Test.ImTest")
    val methods = classMirror.reflectModule(classTest)
    val objMirror = classMirror.reflect(methods.instance)

    val method = methods.symbol.typeSignature.member(ru.TermName("len")).asMethod

    val result = objMirror.reflectMethod(method)("bbb")

    def d(s: String) = {
      objMirror.reflectMethod(method)(s)
    }
    spark.udf.register("len", d _)
    spark.sql("select len('bb')").show()

Error:

18/11/21 09:58:52 INFO execution.SparkSqlParser: Parsing command: select len('bb')
18/11/21 09:58:54 INFO codegen.CodeGenerator: Code generated in 475.734954 ms
Exception in thread "main" org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
    at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:2106)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:840)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:839)
  

Caused by: java.io.NotSerializableException: scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$$anon$9
Serialization stack:
    - object not serializable (class: scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$$anon$9, value: method len)
    - field (class: Test.Main$$anonfun$main$1, name: method$1, type: interface scala.reflect.api.Symbols$MethodSymbolApi)
    - object (class Test.Main$$anonfun$main$1, )
    - field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$2, name: func$2, type: interface scala.Function1)
    - object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$2, )
    - field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF, name: f, type: interface scala.Function1)
    - object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF, UDF:len(bb))
    - element of array (index: 0)
    - array (class [Ljava.lang.Object;, size 2)
    - field (class: org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8, name: references$1, type: class [Ljava.lang.Object;)
    - object (class org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8, )
    at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
    at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
    ... 44 more
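The serialization stack above shows what goes wrong: the closure `d` captures `objMirror` and `method`, and a reflection `MethodSymbol` is not serializable, so Spark cannot ship the UDF to executors. A common workaround is to capture only plain strings (the module and method names) and perform the reflection lazily, so the lookup runs after deserialization on each executor. Below is a minimal sketch of that idea; the wrapper class `ReflectiveUdf` is a hypothetical name, not part of any library:

```scala
import scala.reflect.runtime.{universe => ru}

// Hypothetical wrapper: holds only serializable strings. The reflection
// happens inside a @transient lazy val, so the non-serializable mirrors
// are rebuilt on each JVM after deserialization instead of being shipped.
class ReflectiveUdf(moduleName: String, methodName: String)
    extends (String => String) with Serializable {

  @transient private lazy val call: ru.MethodMirror = {
    val mirror    = ru.runtimeMirror(getClass.getClassLoader)
    val instance  = mirror.reflectModule(mirror.staticModule(moduleName)).instance
    val objMirror = mirror.reflect(instance)
    val method    = objMirror.symbol.typeSignature
                      .member(ru.TermName(methodName)).asMethod
    objMirror.reflectMethod(method)
  }

  def apply(s: String): String = call(s).asInstanceOf[String]
}
```

Registration would then capture only the serializable wrapper, e.g. `spark.udf.register("len", new ReflectiveUdf("Test.ImTest", "len"))`. Note this assumes the dynamically loaded jar is also available on the executors' classpath (for instance via `--jars`), since the lookup is repeated there.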

0 Answers:

There are no answers yet.