java.lang.RuntimeException: org.apache.spark.SparkException: Task not serializable at solr.DefaultSource.createRelation

Time: 2018-12-21 13:11:21

Tags: apache-spark

I have seen many posts about this kind of serialization error, but I am fairly new to this.

I have a DataFrame, modProductsData, and a Map, L2L3Map. I want to replace the values in the PRIMARY_CATEGORY column with the corresponding values from L2L3Map.


The code that produces the error is below:

import org.apache.spark.sql.functions.{col, lit, udf, when}

// Collect the two-column L2 DataFrame into a driver-side lookup map.
val L2L3Map = L2.collect.map(row => (row.get(0).toString, row.get(1).toString)).toMap
// UDF whose closure refers to L2L3Map directly.
val L2L3MapUDF = udf { s: String => L2L3Map.get(s) }
val productsData = spark.read.format("solr").options(readFromSourceClusterOpts).load
var modProductsData = productsData.withColumn(
  "Prime_L2_s",
  when(col("PRIMARY_CATEGORY").isNotNull,
    when(col("PRIMARY_CATEGORY").isin(L3ids: _*), L2L3MapUDF(col("PRIMARY_CATEGORY")))
      .otherwise(
        when(col("PRIMARY_CATEGORY").isin(L2ids: _*), col("PRIMARY_CATEGORY"))
          .otherwise(lit(null))))
    .otherwise(lit(null)))
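
Read row by row, the nested when/otherwise expression above seems to amount to the rule sketched below. This is only a plain-Scala model for checking the intended result, not Spark code; the object name PrimeL2Rule, the function name primeL2, and the use of Option to stand in for null are all assumptions made for illustration.

```scala
// Hypothetical plain-Scala model of the per-row rule; not part of the original code.
object PrimeL2Rule {
  def primeL2(
      category: Option[String],        // None models a null PRIMARY_CATEGORY
      l3Ids: Set[String],              // stands in for L3ids
      l2Ids: Set[String],              // stands in for L2ids
      l2l3Map: Map[String, String]     // stands in for L2L3Map
  ): Option[String] =
    category.flatMap { id =>
      if (l3Ids.contains(id)) l2l3Map.get(id)   // id is in L3ids: look it up in the map
      else if (l2Ids.contains(id)) Some(id)     // id is in L2ids: keep it unchanged
      else None                                 // anything else: null
    }
}
```

A missing map entry and a null input both come out as None here, which corresponds to a null in the Prime_L2_s column.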

1 Answer:

Answer 0 (score: 0)

It worked with the code below :)

// Build the lookup function from a map passed in as a parameter.
def mapPhantom(flagMap: Map[String, String]): String => String = {
  (id: String) => flagMap.getOrElse(id, null)
}

val L2L3Map = L2.collect.map(row => (row.get(0).toString, row.get(1).toString)).toMap
val L2L3UDF = udf(mapPhantom(L2L3Map))

var modProductsData = productsData.withColumn(
  "Prime_L2_s",
  when(col("PRIMARY_CATEGORY").isNotNull,
    when(col("PRIMARY_CATEGORY").isin(L3ids: _*), L2L3UDF(col("PRIMARY_CATEGORY")))
      .otherwise(
        when(col("PRIMARY_CATEGORY").isin(L2ids: _*), col("PRIMARY_CATEGORY"))
          .otherwise(lit(null))))
    .otherwise(lit(null)))
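
A possible explanation of why this helps, assuming the code ran somewhere the top-level vals are members of a wrapper object (spark-shell or a notebook, for example): a closure that reads L2L3Map through its enclosing scope captures that whole scope, and if anything in it is not serializable the task cannot be shipped. A closure returned by mapPhantom captures only the map it was given. The sketch below reproduces that difference with plain Java serialization and no Spark; the names ClosureCaptureSketch, Enclosing, capturesThis, and canSerialize are hypothetical.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCaptureSketch {
  // Hypothetical stand-in for an enclosing scope that is not Serializable.
  class Enclosing(val lookup: Map[String, String]) {
    // The body reads the field via `this`, so the closure captures the whole instance.
    def capturesThis: String => Option[String] = (s: String) => lookup.get(s)
  }

  // Same shape as mapPhantom in the answer: the map arrives as a parameter,
  // so the returned closure captures only the map itself.
  def mapPhantom(flagMap: Map[String, String]): String => String =
    (id: String) => flagMap.getOrElse(id, null)

  // Returns true if plain Java serialization of `f` succeeds.
  def canSerialize(f: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(f)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val m = Map("c1" -> "b1")
    println(canSerialize(new Enclosing(m).capturesThis)) // closure that captures `this`
    println(canSerialize(mapPhantom(m)))                 // closure that captures only the map
  }
}
```

Whether this is the exact cause of the original error depends on how the code was run, so treat it as a hypothesis to check rather than a diagnosis.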